- optimizer can now save its state via dill
- updated tutorials
t-schn committed Jul 10, 2024
1 parent 4122609 commit 8be1f36
Showing 15 changed files with 458 additions and 308 deletions.
226 changes: 123 additions & 103 deletions docs/tutorials/0_quickstart.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 5,
"metadata": {
"editable": true,
"slideshow": {
@@ -37,7 +37,7 @@
"source": [
"# 🚀 Quick Start\n",
"\n",
"To illustrate some of the core concepts, let's use SAMMO to generate content for a travel website.\n",
"At the core of SAMMO are symbolic prompt programs. This tutorial will show you a few simple programss.\n",
"\n",
"To run this example, you need API credentials to an OpenAI API compatible model. \n",
"\n",
@@ -49,7 +49,7 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 8,
"metadata": {
"editable": true,
"slideshow": {
@@ -59,7 +59,7 @@
},
"outputs": [],
"source": [
"# %load -r :27 _init.py\n",
"# %load -r 3:25 _init.py\n",
"import pathlib\n",
"import sammo\n",
"from sammo.runners import OpenAIChat\n",
@@ -71,18 +71,14 @@
"import requests\n",
"import os\n",
"\n",
"API_CONFIG_FILE = pathlib.Path().cwd().parent / \"config\" / \"personal.openai\"\n",
"API_CONFIG = \"\"\n",
"if API_CONFIG_FILE.exists():\n",
" API_CONFIG = API_CONFIG_FILE\n",
"if not API_CONFIG:\n",
" raise ValueError('Please set API_CONFIG to {\"api_key\": \"YOUR_KEY\"}')\n",
"if not 'OPENAI_API_KEY' in os.environ:\n",
" raise ValueError(\"Please set the environment variable 'OPENAI_API_KEY'.\")\n",
"\n",
"_ = sammo.setup_logger(\"WARNING\") # we're only interested in warnings for now\n",
"\n",
"runner = OpenAIChat(\n",
" model_id=\"gpt-3.5-turbo-16k\",\n",
" api_config=API_CONFIG,\n",
" model_id=\"gpt-3.5-turbo\",\n",
" api_config={\"api_key\": os.environ['OPENAI_API_KEY']},\n",
" cache=os.getenv(\"CACHE_FILE\", \"cache.tsv\"),\n",
" timeout=30,\n",
")"
@@ -98,12 +94,12 @@
"tags": []
},
"source": [
"How about a quick 'Hello World?'?"
"Let's write our first symbolic prompt program (SPP)! How about a quick 'Hello World?'?"
]
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 11,
"metadata": {
"editable": true,
"slideshow": {
@@ -123,20 +119,59 @@
"Constants: None"
]
},
"execution_count": 15,
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"Output(GenerateText(\"Hello World!\")).run(runner)"
"spp_hello_world = Output(GenerateText(\"Hello World!\"))\n",
"spp_hello_world.run(runner)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calls via `.run()` always return a DataTable which keeps track of the input and output. It might be a little confusing to see an empty input field, but this is because we did not specify any actual input data. More on this in \"Working with Data\"."
"A symbolic prompt program is simply a tree of nested expressions. We can see this by printing the output:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Output(\n",
" child = GenerateText(\n",
" child = 'Hello World!',\n",
" name = None,\n",
" system_prompt = None,\n",
" history = None,\n",
" seed = 0,\n",
" randomness = 0,\n",
" max_tokens = None,\n",
" json_mode = False,\n",
" on_error = 'empty_result'\n",
" ),\n",
" minibatch_size = 1,\n",
" on_error = 'raise'\n",
")\n"
]
}
],
"source": [
"print(spp_hello_world)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Calls via `.run()` always return a DataTable which keeps track of the input and output. Inputs refer to dynamic input data which we did not specify here."
]
},
{
@@ -149,13 +184,13 @@
"tags": []
},
"source": [
"## Specifying a metaprompt\n",
"Let's say we have a list of countries. For each country, we want the top reason to visit as well as when to visit."
"## Writing symbolic prompt programs\n",
"Okay, let's move on to a more interesting example. For a list of countries, we want the top reason to visit:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"execution_count": 18,
"metadata": {
"editable": true,
"slideshow": {
@@ -168,32 +203,29 @@
"name": "stdout",
"output_type": "stream",
"text": [
"minibatches[###################################################################################]5/5[00:00<??:??, 0.00it/s]\n",
"+-------------+------------------------------------------------------------------------------------------------------+\n",
"| input | output |\n",
"+=============+======================================================================================================+\n",
"| Switzerland | # Switzerland The top reason to visit Switzerland is to experience its breathtaking landscapes, from |\n",
"| | majestic mountains to pristine lakes. ## When to Visit The best time to visit Switzerland is during |\n",
"| | the summer season (June to August) when the weather is pleasant and outdoor activities are abun... |\n",
"+-------------+------------------------------------------------------------------------------------------------------+\n",
"| Morocco | # Morocco The top reason to visit Morocco is to immerse yourself in its rich and diverse culture, |\n",
"| | blending Arab, Berber, and European influences. ## When to Visit The best time to visit Morocco is |\n",
"| | during spring (March to May) when the weather is pleasant and the landscapes are lush. |\n",
"+-------------+------------------------------------------------------------------------------------------------------+\n",
"| Tanzania | # Tanzania The top reason to visit Tanzania is to witness the breathtaking beauty of the Serengeti |\n",
"| | National Park and experience the awe-inspiring Great Migration. ## When to Visit The best time to |\n",
"| | visit Tanzania is during the dry season, from June to October, when wildlife viewing is at its peak. |\n",
"+-------------+------------------------------------------------------------------------------------------------------+\n",
"| Indonesia | # Indonesia The top reason to visit Indonesia is its breathtaking natural beauty, from stunning |\n",
"| | beaches and lush rainforests to active volcanoes and diverse wildlife. ## When to Visit The best |\n",
"| | time to visit Indonesia is during the dry season, which is from May to September. |\n",
"+-------------+------------------------------------------------------------------------------------------------------+\n",
"| Peru | # Peru The top reason to visit Peru is to experience the awe-inspiring ancient ruins of Machu |\n",
"| | Picchu. ## When to Visit The best time to visit Peru is during the dry season, which is from May to |\n",
"| | September. |\n",
"+-------------+------------------------------------------------------------------------------------------------------+\n",
"Constants: None\n"
"minibatches[##################################################################################]5/5[00:00<??:??, 0.00it/s]\n"
]
},
{
"data": {
"text/plain": [
"+-------------+--------------------------------------------------------------+\n",
"| input | output |\n",
"+=============+==============================================================+\n",
"| Switzerland | The stunning natural beauty of the Swiss Alps and crystal- |\n",
"| | clear lakes make Switzerland a must-visit destination for |\n",
"| | outdoor enthusiasts and nature lovers. |\n",
"+-------------+--------------------------------------------------------------+\n",
"| Morocco | The top reason to visit Morocco is to experience the vibrant |\n",
"| | culture, stunning architecture, and diverse landscapes of |\n",
"| | this North African country. |\n",
"+-------------+--------------------------------------------------------------+\n",
"Constants: None"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
@@ -202,19 +234,7 @@
"reason_to_visit = GenerateText(\n",
" Template(\"What is the top reason to visit {{input}} in one sentence?\")\n",
")\n",
"when_to_visit = GenerateText(\n",
" Template(\n",
" \"Which season is the best time to visit {{input}}? Answer in one sentence.\"\n",
" )\n",
")\n",
"country_pages = Template(\n",
" \"# {{input}}\\n{{reason}}\\n\\n## When to Visit\\n{{when}}\",\n",
" reason=reason_to_visit,\n",
" when=when_to_visit,\n",
")\n",
"\n",
"results = Output(country_pages).run(runner, COUNTRIES)\n",
"print(results.to_string(max_col_width=100, max_cell_length=300))"
"Output(reason_to_visit).run(runner, COUNTRIES)[:2]"
]
},
{
@@ -227,16 +247,14 @@
"tags": []
},
"source": [
"Great, we just finished our travel blog in less than five minutes! \n",
"What happens under the hood is that SAMMO parallizes the execution across all inputs automatically! \n",
"\n",
"Under the hood, `country_pages` is a graph of nested `Components` and gets called from the inside out. We refer to these call graphs as *metaprompts* because they are abstract away input data (as opposed to *prompts* which are concrete text strings sent to an LLM).\n",
"\n",
"We can see the metaprompt structure by simply printing it:"
"Let's add the best time to visit to it and combine both pieces of information."
]
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 17,
"metadata": {
"editable": true,
"slideshow": {
@@ -249,45 +267,49 @@
"name": "stdout",
"output_type": "stream",
"text": [
"Template(\n",
" template_text = '# {{input}}\n",
"{{reason}}\n",
"\n",
"## When to Visit\n",
"{{when}}',\n",
" name = None,\n",
" reason = GenerateText(\n",
" child = Template(\n",
" template_text = 'What is the top reason to visit {{input}} in one sentence?',\n",
" name = None\n",
" ),\n",
" name = None,\n",
" system_prompt = None,\n",
" history = None,\n",
" seed = 0,\n",
" randomness = 0,\n",
" max_tokens = None,\n",
" on_error = 'raise'\n",
" ),\n",
" when = GenerateText(\n",
" child = Template(\n",
" template_text = 'Which season is the best time to visit {{input}}? Answer in one sentence.',\n",
" name = None\n",
" ),\n",
" name = None,\n",
" system_prompt = None,\n",
" history = None,\n",
" seed = 0,\n",
" randomness = 0,\n",
" max_tokens = None,\n",
" on_error = 'raise'\n",
" )\n",
")\n"
"minibatches[##################################################################################]5/5[00:00<??:??, 0.00it/s]\n"
]
},
{
"data": {
"text/plain": [
"+-------------+-------------------------------------------------------------+\n",
"| input | output |\n",
"+=============+=============================================================+\n",
"| Switzerland | # Switzerland The stunning natural beauty of the Swiss Alps |\n",
"| | and crystal-clear lakes make Switzerland a must-visit |\n",
"| | destination for outdoor enthusiasts and nature lovers. ## |\n",
"| | When to Visit The best time to visit Switzerland is during |\n",
"| | the summer months (June to August) when the weather is warm |\n",
"| | and ideal for outdoor activities. |\n",
"+-------------+-------------------------------------------------------------+\n",
"| Morocco | # Morocco The top reason to visit Morocco is to experience |\n",
"| | the vibrant culture, stunning architecture, and diverse |\n",
"| | landscapes of this North African country. ## When to Visit |\n",
"| | The best time to visit Morocco is in the spring (March to |\n",
"| | May) when the weather is mild and the landscape is lush and |\n",
"| | blooming. |\n",
"+-------------+-------------------------------------------------------------+\n",
"Constants: None"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"print(country_pages)"
"when_to_visit = GenerateText(\n",
" Template(\n",
" \"Which season is the best time to visit {{input}}? Answer in one sentence.\"\n",
" )\n",
")\n",
"country_pages = Template(\n",
" \"# {{input}}\\n{{reason}}\\n\\n## When to Visit\\n{{when}}\",\n",
" reason=reason_to_visit,\n",
" when=when_to_visit,\n",
")\n",
"Output(country_pages).run(runner, COUNTRIES)[:2]"
]
},
{
@@ -300,14 +322,12 @@
"tags": []
},
"source": [
"`SAMMO` also knows which operations can be done in parallel and schedules things accordingly. You can specify call limits the `Runner` instance (more on this in the section on minibatching).\n",
"\n",
"## Recap\n",
"Let's talk about some of the key concepts from SAMMO we have used:\n",
"\n",
"1. We constructed a **metaprompt** — a dynamic prompt that is re-used for different inputs.\n",
"1. We constructed a **symbolic prompt program** — a dynamic prompt that is re-used for different inputs.\n",
"2. This metaprompt has a structure which was constructed by nesting **components** from SAMMO. A helpful analogy might be to think of how we construct neural architectures.\n",
"3. To get the **output** for a metaprompt, we need to wrap the metaprompt in an Output component which returns a list of Result objects.\n",
"3. To get the **output** for a metaprompt, we need to wrap the metaprompt in an Output component which returns a DataTable.\n",
"4. SAMMO **parallelized** execution for us on the input data — no extra work was needed! "
]
}
@@ -328,7 +348,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.11.8"
},
"widgets": {
"application/vnd.jupyter.widget-state+json": {
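The updated tutorial describes a symbolic prompt program as a tree of nested expressions that is evaluated from the inside out for each input row. The following is a minimal toy sketch of that idea only. The classes `Template` and `GenerateText`, the `run` helper, and `fake_llm` are hypothetical stand-ins written for illustration; they reuse names from the notebook but are not SAMMO's implementation and make no network calls.

```python
# Toy illustration only: hypothetical stand-ins, NOT SAMMO's actual classes.
from dataclasses import dataclass, field


@dataclass
class Template:
    """Fills {{name}} placeholders; nested child components are evaluated first."""

    text: str
    children: dict = field(default_factory=dict)

    def evaluate(self, llm, row):
        # The current input row is always available as {{input}}.
        values = {"input": row}
        # Evaluate children before the parent (inside-out evaluation).
        for name, child in self.children.items():
            values[name] = child.evaluate(llm, row)
        result = self.text
        for name, value in values.items():
            result = result.replace("{{" + name + "}}", value)
        return result


@dataclass
class GenerateText:
    """Renders its child into a concrete prompt and passes it to the model."""

    child: Template

    def evaluate(self, llm, row):
        prompt = self.child.evaluate(llm, row)
        return llm(prompt)


def run(program, llm, rows):
    """Evaluate the program once per input row, returning (input, output) pairs."""
    return [(row, program.evaluate(llm, row)) for row in rows]


def fake_llm(prompt):
    """Hypothetical model stub that echoes the prompt it received."""
    return f"<answer to: {prompt}>"


reason = GenerateText(Template("Why visit {{input}}?"))
when = GenerateText(Template("When to visit {{input}}?"))
page = Template(
    "# {{input}}\n{{reason}}\n\n## When to Visit\n{{when}}",
    {"reason": reason, "when": when},
)

table = run(page, fake_llm, ["Peru"])
print(table[0][1])
```

The same `reason` component can be run on its own or nested inside `page`, which mirrors how the notebook first runs `reason_to_visit` alone and then reuses it inside `country_pages`.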