Programming Foundation Models with DSPy - Meetup Slides

DSPy: Programming, not Prompting, Language Models
Shangyin Tan

It’s never been easier to prototype
impressive AI demos.

Turning monolithic LMs into reliable AI
systems remains challenging.

The DSPy paradigm - dspy.ai
Let’s program—not prompt—LMs.
Connect declarative modules into a computation graph, and compile it into a chain of optimized
prompts (or LM finetunes) automatically. How?
1. Hand-written prompts become Signatures: "question -> answer", "long_document -> summary"
2. Prompting techniques and chains become Modules: dspy.ChainOfThought, dspy.ReAct
3. Manual prompt engineering becomes:
a. Optimizers (given a metric you want to maximize)
b. Assertions (similar to assertions in programming languages)

Let’s get concrete: Question Answering with HotPotQA
Question: How many storeys are in the castle David Gregory inherited?
Passages: (1) St. Gregory Hotel is a 9-floor boutique hotel… (2) Kinnairdy Castle
is a tower house with five storeys…
Answer: Kinnairdy Castle has five storeys.

Let’s build three programs for this task. Program 1.
CoT = dspy.ChainOfThought("question -> answer")
Question: How many storeys are in the castle David Gregory inherited?
Answer: Castle Gregory has three storeys.
Diagram: Question -> CoT -> Chain of Thought (Reasoning); Answer

Let’s build another program for this task. Program 2.
class RAG(dspy.Module):
    def __init__(self, num_passages=3):
        super().__init__()
        self.retrieve = dspy.Retrieve(k=num_passages)
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        passages = self.retrieve(question).passages
        return self.generate_answer(context=passages, question=question)
Diagram: Question -> Retriever <finds relevant passages> -> LM -> Answer
Retriever: St. Gregory Hotel is a 9-floor boutique hotel…
LM: St. Gregory Hotel has nine storeys.

Let’s build a solid Program 3 for this task.
class MultiHop(dspy.Module):
    def __init__(self, passages_per_hop=3):
        super().__init__()
        self.generate_query = dspy.ChainOfThought("context, question -> query")
        self.retrieve = dspy.Retrieve(k=passages_per_hop)
        self.generate_answer = dspy.ChainOfThought("context, question -> answer")

    def forward(self, question):
        context = []
        for _ in range(2):
            # Each hop writes a new search query from what has been found so far.
            query = self.generate_query(context=context, question=question).query
            context += self.retrieve(query).passages
        return self.generate_answer(context=context, question=question)
LM (query 1): What castle did David Gregory inherit?
Retriever: David Gregory… inherited the Kinnairdy Castle in 1664.
LM (query 2): How many storeys are in Kinnairdy Castle?
Retriever: Kinnairdy Castle is a tower house with five storeys…
LM (answer): Kinnairdy Castle has five storeys.
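
The two-hop control flow can be tried without any LM or retrieval service. The sketch below is an illustration only: the three-passage corpus, the word-overlap retriever, and the rule-based query writer are hypothetical stand-ins, not DSPy components.

```python
# Toy stand-ins for the retriever and the LM (assumptions, not DSPy APIs).
CORPUS = [
    "St. Gregory Hotel is a 9-floor boutique hotel.",
    "David Gregory inherited the Kinnairdy Castle in 1664.",
    "Kinnairdy Castle is a tower house with five storeys.",
]
STOPWORDS = {"a", "are", "did", "how", "in", "is", "many", "the", "what", "with"}
QUESTION = "How many storeys are in the castle David Gregory inherited?"


def tokenize(text):
    """Lowercase, drop basic punctuation and stopwords, return a set of words."""
    cleaned = text.lower().replace("?", " ").replace(".", " ").replace(",", " ")
    return {word for word in cleaned.split() if word not in STOPWORDS}


def retrieve(query, k=1):
    """Return the k passages sharing the most words with the query."""
    query_words = tokenize(query)
    ranked = sorted(CORPUS, key=lambda p: len(query_words & tokenize(p)), reverse=True)
    return ranked[:k]


def generate_query(context, question):
    """Hypothetical query writer: ask which castle first, then ask about it."""
    if not context:
        return "What castle did David Gregory inherit?"
    return "How many storeys are in Kinnairdy Castle?"


def multi_hop(question, hops=2):
    """Same loop shape as MultiHop.forward: query, retrieve, accumulate."""
    context = []
    for _ in range(hops):
        query = generate_query(context, question)
        context += retrieve(query)
    return context


context = multi_hop(QUESTION)
```

After two hops the context holds the inheritance passage followed by the storeys passage, which is what the answer step needs.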

What does the DSPy Compiler do?
Inputs: a DSPy program and a few examples (labeled and/or unlabeled)
DSPy Compiler (a specific optimizer)
Output: an improved DSPy program
improved_dspy_program = optimizer.compile(dspy_program, few_examples)

Let’s compile our multi-hop program.
context = []
for _ in range(2):
    query = self.generate_query(context=context, question=question).query
    context += self.retrieve(query).passages
return self.generate_answer(context=context, question=question)

Basket of optimizers (and growing!)
fewshot_program = dspy.LabeledFewShot(k=8).compile(program, trainset=trainset)
teleprompter = dspy.BootstrapFewShotWithRandomSearch(metric=gsm8k_accuracy)
bootstrapped_program = teleprompter.compile(program, trainset=trainset)
ensemble = dspy.Ensemble(reduce_fn=dspy.majority).compile(bootstrapped_program.programs[:7])
bootstrappedx2_program = teleprompter.compile(program, teacher=bootstrapped_program, trainset=trainset)

GSM8K - grade school math dataset
20 birds migrate on a seasonal basis from one lake to another, searching
for food. If they fly from lake Jim to lake Disney in one season, which is 50
miles apart, then the next season they fly from lake Disney to lake London,
60 miles apart, calculate the combined distance all of the birds have
traveled in the two seasons.
Answer: 20 * (50 + 60) = 2200 miles

First DSPy program
class SimpleMathSolver(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)

simple_math_solver = SimpleMathSolver()

Given the fields `question`, produce the fields `answer`.
---
Follow the following format.
Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}
---
Question: 20 birds migrate on a seasonal basis from one lake to another, searching for food. If they fly
from lake Jim to lake Disney in one season, which is 50 miles apart, then the next season they fly from
lake Disney to lake London, 60 miles apart, calculate the combined distance all of the birds have traveled
in the two seasons.
Reasoning: Let's think step by step in order to produce the answer. We know that the
birds fly 50 miles from lake Jim to lake Disney and then 60 miles from lake Disney to
lake London. To find the combined distance, we simply add the two distances
together.
Answer: The combined distance all of the birds have traveled in the two seasons is
50 miles + 60 miles = 110 miles.

Two problems:
1. The reasoning does not use all the information needed (the 20 birds are ignored)
2. The answer is a full sentence, not a single number
LMs are not following what we expect.
And, there's no way to even specify the constraints except manual
prompt tuning.

Goal: Enable DSPy programmers to define constraints on LM behavior.
Can we make a robust, extensible programming construct?

Including Assertions in DSPy
dspy.Assert - DSPy must either pass the assertion or raise an Exception
dspy.Suggest - DSPy should try to pass the assertion, but permit the code to continue otherwise
dspy.Suggest(constraint: bool, instruction_message: str)
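
The difference between the two constructs can be sketched in plain Python. This is a simplified model under stated assumptions, not DSPy's implementation: `run_with_constraint`, `ConstraintFailed`, the retry count, and both stub LMs are hypothetical names introduced here.

```python
class ConstraintFailed(Exception):
    """Raised when a hard constraint still fails after every retry."""


def run_with_constraint(generate, check, message, hard, max_retries=2):
    """Call generate(feedback) until check(output) passes or retries run out.

    hard=True models the Assert behavior (raise on final failure);
    hard=False models the Suggest behavior (return the last output and continue).
    """
    feedback = None
    output = None
    for _ in range(max_retries + 1):
        output = generate(feedback)
        if check(output):
            return output
        feedback = message  # the instruction is handed back on the next attempt
    if hard:
        raise ConstraintFailed(message)
    return output


SENTENCE = "The combined distance is 2200 miles."
MESSAGE = "Your Answer should be a number."


def stub_lm(feedback):
    # Hypothetical LM: answers tersely only once it has seen feedback.
    return "2200" if feedback else SENTENCE


def stubborn_lm(feedback):
    # Hypothetical LM that never changes its answer.
    return SENTENCE


def is_short(answer):
    return len(answer) < 10


fixed = run_with_constraint(stub_lm, is_short, MESSAGE, hard=False)
tolerated = run_with_constraint(stubborn_lm, is_short, MESSAGE, hard=False)
```

With the soft variant, a model that never complies still yields an output and the program continues; with `hard=True` the same situation raises `ConstraintFailed`.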

Introducing DSPy Assertions, to guide LM behavior in DSPy programs.
DSPy Assertions enable:
1. Assertion-driven backtracking: self-refine LMs by providing feedback and the past erroneous response
2. Example selection and bootstrapping: provide better demonstrations for in-context learning

DSPy program with Assertions
def extract_number(question):
    """extract numbers from a question"""
    ...  # body not shown on the slide

def has_numbers(rationale, numbers):
    """whether rationale has all the numbers, if not, return the missing number"""
    ...  # body not shown on the slide

class SimpleMathSolverWithSuggest(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        pred = self.prog(question=question)
        rationale_has_numbers, missing_number = has_numbers(pred.rationale, extract_number(question))
        dspy.Suggest(rationale_has_numbers, f"Your Reasoning should contain {missing_number}.")
        dspy.Suggest(len(pred.answer) < 10, "Your Answer should be a number.")
        return pred

simple_math_solver_suggest = SimpleMathSolverWithSuggest().activate_assertions()
Assertion-driven backtracking: self-refine LMs by providing feedback and the past erroneous response
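
The slide shows only the docstrings of the two helpers. One possible way to fill them in is sketched below; the regex-based bodies are an assumption made for illustration, not the speaker's code.

```python
import re


def extract_number(question):
    """Return every integer written with digits in the question, as strings."""
    return re.findall(r"\d+", question)


def has_numbers(rationale, numbers):
    """Return (True, None) if every number appears in the rationale,
    otherwise (False, first_missing_number)."""
    found = set(re.findall(r"\d+", rationale))
    for number in numbers:
        if number not in found:
            return False, number
    return True, None


question = (
    "20 birds migrate on a seasonal basis from one lake to another, searching "
    "for food. If they fly from lake Jim to lake Disney in one season, which is "
    "50 miles apart, then the next season they fly from lake Disney to lake "
    "London, 60 miles apart, calculate the combined distance all of the birds "
    "have traveled in the two seasons."
)
rationale = (
    "We know that the birds fly 50 miles from lake Jim to lake Disney and then "
    "60 miles from lake Disney to lake London."
)
ok, missing = has_numbers(rationale, extract_number(question))
```

On the bird-migration example, the check fails because the first rationale never mentions the 20 birds, which is the feedback the suggestion message carries.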

---
Previous Reasoning: past Reasoning with errors
Instructions: Some instructions you must satisfy
Answer: ${answer}
---
Question: 20 birds migrate on a seasonal basis from one lake to another, searching for food. If they fly from lake Jim
to lake Disney in one season, which is 50 miles apart, then the next season they fly from lake Disney to lake London,
60 miles apart, calculate the combined distance all of the birds have traveled in the two seasons.
Previous Reasoning: produce the answer. We know that the birds fly 50 miles from lake Jim to lake Disney and then
60 miles from lake Disney to lake London. To find the combined distance, we simply add the two distances together.
Instructions: Your Reasoning should contain 20.
Reasoning: Let's think step by step in order to calculate the combined distance all of the birds have
traveled in the two seasons. In the first season, 20 birds fly 50 miles from lake Jim to lake Disney,
which totals to 20 birds * 50 miles = 1000 miles. In the second season, they fly 60 miles from lake
Disney to lake London, which totals to 20 birds * 60 miles = 1200 miles. Therefore, the combined
distance all of the birds have traveled in the two seasons is 1000 miles + 1200 miles = 2200 miles.
Answer: The combined distance all of the birds have traveled in the two seasons is 2200 miles.

---
Previous Reasoning: past Reasoning with errors
Instructions: Some instructions you must satisfy
Answer: ${answer}
---
Question: 20 birds migrate on a seasonal basis from one lake to another, searching for food. If they fly from lake Jim
to lake Disney in one season, which is 50 miles apart, then the next season they fly from lake Disney to lake London,
60 miles apart, calculate the combined distance all of the birds have traveled in the two seasons.
Previous Answer: The combined distance all of the birds have traveled in the two seasons is 2200 miles.
Instructions: Your Answer should be a number.
Reasoning: Let's think step by step in order to produce the answer. We know that in the first season,
20 birds fly 50 miles from lake Jim to lake Disney, totaling 20 birds * 50 miles = 1000 miles. In the
second season, they fly 60 miles from lake Disney to lake London, totaling 20 birds * 60 miles =
1200 miles. Therefore, the combined distance all of the birds have traveled in the two seasons is
1000 miles + 1200 miles = 2200 miles.
Answer: 2200 miles
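
The retry prompts in these traces follow a simple pattern: the question, the previous faulty field, and the instruction are placed before a fresh Reasoning field. A minimal sketch of that assembly follows; `build_retry_prompt` is a hypothetical helper, and DSPy's real prompt construction is more involved.

```python
def build_retry_prompt(question, field, previous_value, instruction):
    """Assemble the feedback fields seen in the traces into one prompt string."""
    lines = [
        f"Question: {question}",
        f"Previous {field}: {previous_value}",
        f"Instructions: {instruction}",
        "Reasoning: Let's think step by step in order to",
    ]
    return "\n".join(lines)


prompt = build_retry_prompt(
    question="calculate the combined distance all of the birds have traveled",
    field="Answer",
    previous_value="The combined distance is 2200 miles.",
    instruction="Your Answer should be a number.",
)
```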

Example selection and bootstrapping: provide better demonstrations for in-context learning
optimizer = BootstrapFewShotWithRandomSearch(
    gsm8k_metric, max_bootstrapped_demos=3, max_labeled_demos=6, num_candidate_programs=6)
compiled_prog = optimizer.compile(student=simple_math_solver, trainset=trainset)
compiled_prog_suggest = optimizer.compile(student=simple_math_solver_suggest, trainset=trainset)
1. Demonstrations need to pass DSPy assertions, too.
2. Demonstrations can also contain traces with errors and fixes.
3. The optimizer is "assertion-aware": in addition to the metric, it also counts assertion failures.
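
The idea of assertion-aware demonstration selection can be illustrated with a small filter. This is a sketch under assumptions: `select_demos`, the dictionary layout of a trace, and the toy metric are hypothetical, and the actual optimizer does considerably more (bootstrapping, random search over candidate programs).

```python
def select_demos(candidates, metric, assertions, max_demos=3):
    """Keep traces that pass the metric and every assertion.

    Returns the kept demonstrations and the total number of assertion
    failures observed across all candidates.
    """
    kept, failures = [], 0
    for example in candidates:
        failed = sum(1 for check in assertions if not check(example))
        failures += failed
        if failed == 0 and metric(example) and len(kept) < max_demos:
            kept.append(example)
    return kept, failures


candidates = [
    {"rationale": "20 birds * (50 + 60) miles", "answer": "2200", "gold": "2200"},
    {"rationale": "50 + 60", "answer": "110", "gold": "2200"},
    {"rationale": "20 * 110", "answer": "The combined distance is 2200 miles.", "gold": "2200"},
]


def metric(example):
    # Toy metric: the gold number must appear in the answer.
    return example["gold"] in example["answer"]


assertions = [
    lambda ex: "20" in ex["rationale"],   # reasoning mentions the bird count
    lambda ex: len(ex["answer"]) < 10,    # answer is short, ideally a number
]

demos, failure_count = select_demos(candidates, metric, assertions)
```

Only the first trace survives: the second omits the bird count and the third answers in a sentence, so two assertion failures are counted in total.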

Results
Program                 Score
Simple CoT              61.7
CoT w Assertions        74.3
Compiled CoT            83.7
Compiled w Assertion    84.0
*Compiled w Assertion does not meaningfully outperform Compiled CoT on this task (84.0 vs. 83.7).
Our observation: the harder the assertions are, the better they are at
improving in-context learning demonstrations.

How could we use LM Assertions here?
HumanEval - code generation

First DSPy program
class NaiveCodeGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("prompt -> code")

    def forward(self, prompt):
        pred = self.prog(prompt=prompt)
        return pred

Code Generation Pipeline
Prompt -> 🤖 Code Generator -> Code
Code Generation Pipeline w. Suggestions
Prompt -> 🤖 Code Generator -> Code
Prompt -> 🤖 Test Generator -> Test
suggest(check_tests(code, test), f"The generated code failed the test {test}")
UPDATED PROMPT WITH FEEDBACK
Prompt: . . .
Past Code: <previous attempt w/ errors> . . .
Instruction: The generated code failed the test <test>, please fix it. . .
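
The slides do not show how `check_tests` works. A minimal sketch of such a checker is given below, assuming the generated code and each test arrive as plain Python source strings; the return shape `(result, failing_test, error)` follows how the helper is used in the deck, while the body is hypothetical. Running generated code with `exec` is unsafe outside a sandbox.

```python
def check_tests(code, tests):
    """Run `code`, then each test snippet, and report the first failure.

    Returns ("passed", None, None) or ("failed", failing_test, error_repr).
    WARNING: exec on model-generated code should only happen in a sandbox.
    """
    namespace = {}
    try:
        exec(code, namespace)  # defines the generated function(s)
    except Exception as exc:
        return "failed", None, repr(exc)
    for test in tests:
        try:
            exec(test, namespace)
        except Exception as exc:
            return "failed", test, repr(exc)
    return "passed", None, None


buggy_code = "def add(a, b):\n    return a - b\n"
fixed_code = "def add(a, b):\n    return a + b\n"
tests = ["assert add(2, 0) == 2", "assert add(1, 1) == 2"]

buggy_result = check_tests(buggy_code, tests)
fixed_result = check_tests(fixed_code, tests)
```

The buggy version passes the first test by coincidence and fails the second, so the feedback message can name the exact test that exposed the bug.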

DSPy program with Assertions
class NaiveCodeGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("prompt -> code")
        self.generate_test = dspy.ChainOfThought("prompt -> test")

    def forward(self, prompt):
        pred = self.prog(prompt=prompt)
        tests = self.generate_test(prompt=prompt)
        result, test, error = check_tests(pred, tests)
        dspy.Suggest(result == "passed",
            f"The generated code failed the test {test}, please fix {error}.",
            backtrack_module=self.prog)
        return pred

Results
Program                           Score
Naive Program Generator           70.7
Program Generator w Assertions    75.6
*Issues with self-consistency:
1. Both tests and code went wrong in the same way
2. Generated tests are wrong, thus not giving useful feedback

Notebook: bit.ly/dspy-gsm8k
bit.ly/dspy-humaneval
bit.ly/dspy-intro
Twitter: @Shangyint

Multi-Hop Question Answering with HotPotQA with Suggestions
context = []
queries = [question]
for _ in range(2):
    query = self.generate_query(context=context, question=question).query
    # The query should be concise
    dspy.Suggest(len(query) < 100,
        "Query should be less than 100 characters")
    # The queries should be different from previous ones
    dspy.Suggest(is_query_distinct(query, queries),
        f"Query should be distinct from {queries}")
    context += self.retrieve(query).passages
    queries.append(query)

Multi-Hop Question Answering with HotPotQA with Suggestions: when a suggestion fails
Fail ❌: the distinctness suggestion does not hold for the new query
1. Update prompt with feedback:
Context: …
Question: …
Past_Query: {previous attempt w/ errors}
Instructions: Query should be distinct from …
2. Backtrack and regenerate query with new prompt

Multi-hop QA Pipeline
Question + Context -> 🤖 Query Generator -> Query -> Retriever -> Context
Question + Context -> 🤖 Answer Generator -> Answer
Checks on each query (given Context + Previous Queries):
✓ suggest(len(query) < 100, "Query should be less than 100 chars")
✗ suggest(is_query_distinct(query, prev_queries), f"Query should be distinct from {prev_queries}")
UPDATED PROMPT WITH FEEDBACK
Context: . . .
Question: . . .
Past Query: <previous attempt w/ errors> . . .
Instruction: Query should be distinct from . . .
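
The deck uses `is_query_distinct` without showing its body. One simple interpretation is an exact match after normalizing case, whitespace, and trailing punctuation, sketched below as an assumption; a real implementation might use fuzzy or embedding similarity instead.

```python
def is_query_distinct(query, previous_queries):
    """True if `query` differs from every earlier query after normalization."""
    def normalize(text):
        # Lowercase, collapse whitespace, drop trailing punctuation.
        return " ".join(text.lower().split()).rstrip("?.! ")

    return normalize(query) not in {normalize(q) for q in previous_queries}


question = "How many storeys are in the castle David Gregory inherited?"
queries = [question]

first = "What castle did David Gregory inherit?"
first_ok = is_query_distinct(first, queries)
queries.append(first)

# A repeat that differs only in case and punctuation is rejected.
repeat_ok = is_query_distinct("what castle did david gregory inherit", queries)
```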
