Deconstructing Dialogue Dollars

Estimating LLM Conversation Costs

Graduate Course: Teaching with Generative AI

Learning Objectives

After this activity, you'll be able to:

  • Define input and output tokens in LLM conversations
  • Explain how conversation history contributes to costs
  • Apply a simplified rule to estimate token counts
  • Calculate the estimated cost of a conversation
  • Compare relative costs of input vs. output tokens
  • Discuss factors influencing conversation cost

Why Token Costs Matter for Educators

  • Budgeting for classroom AI tools
  • Designing cost-effective activities
  • Teaching students responsible AI use
  • Choosing appropriate models
  • Understanding scaling costs

What is a Token?

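A token is the basic unit of text that an LLM processes: a whole word, part of a word, or a punctuation mark.

Examples (approximate; exact counts depend on the model's tokenizer):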

  • "Hello" = 1 token
  • "Supercalifragilisticexpialidocious" = 5 tokens
  • "I love teaching" = 3 tokens

For today's activity: 1 token ≈ 0.75 words

Or: Tokens ≈ Word Count × 1.33

Note: Real tokenization is more complex, but this approximation works for our exercise
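
The rule of thumb above can be sketched in a few lines of Python. This is only a minimal sketch: the function name `estimate_tokens` is illustrative, words are assumed to be whitespace-separated, and rounding to the nearest whole token is an assumption (the rule itself does not specify rounding).

```python
TOKENS_PER_WORD = 1.33  # classroom approximation from this activity, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Estimate tokens as word count x 1.33, rounded to the nearest whole token."""
    word_count = len(text.split())  # words = whitespace-separated chunks
    return round(word_count * TOKENS_PER_WORD)


print(estimate_tokens("I love teaching"))  # 3 words x 1.33 = 3.99, so this prints 4
```

The rule gives 4 tokens for "I love teaching", while the example list shows 3. That gap is the kind of error to expect from an approximation.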

Input vs. Output Tokens

Input Tokens

  • Everything sent TO the model
  • Includes the entire conversation history
  • Current user message
  • System prompts (if any)

Output Tokens

  • Everything generated BY the model
  • The assistant's reply
  • Usually charged at a higher per-token rate than input tokens

Billing Rates for Today's Activity

Model                        Input Cost                Output Cost
Model A (Balanced)           $0.50 / Million tokens    $1.50 / Million tokens
Model B (High-Performance)   $10.00 / Million tokens   $30.00 / Million tokens

These rates are simplified for our exercise. Real rates vary by provider and model.

The Critical Insight

As conversations get longer, input costs grow non-linearly: every new turn re-sends the entire conversation history!

Turn 1: User sends message (50 tokens)
       → Input: 50 tokens
       → Assistant responds (75 tokens)

Turn 2: User sends new message (40 tokens)
       → Input: 50 + 75 + 40 = 165 tokens
       → Assistant responds (80 tokens)

Turn 3: User sends new message (30 tokens)
       → Input: 50 + 75 + 40 + 80 + 30 = 275 tokens
       → Assistant responds (70 tokens)
            

This is why we need to track cumulative input tokens!
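
The three turns above can be reproduced with a short Python sketch. The function name `cumulative_inputs` and the list-of-pairs layout are illustrative assumptions, and system prompts are left out.

```python
def cumulative_inputs(turns):
    """Given (user_tokens, assistant_tokens) per turn, return the input tokens sent on each turn."""
    inputs = []
    history = 0  # tokens from all earlier user messages and assistant replies
    for user_tokens, assistant_tokens in turns:
        inputs.append(history + user_tokens)  # full history plus the new message
        history += user_tokens + assistant_tokens
    return inputs


turns = [(50, 75), (40, 80), (30, 70)]  # the three turns shown above
per_turn = cumulative_inputs(turns)
print(per_turn)                        # [50, 165, 275]
print(sum(per_turn))                   # 490 input tokens billed in total
print(sum(a for _, a in turns))        # 225 output tokens billed in total
```

The user typed only 120 tokens across the three turns, yet 490 input tokens are billed, because each turn re-sends the earlier messages.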

Sample Transcript

## AI Tutoring Session: Solving Quadratic Equations

**User (Turn 1):**
Hi! I need help understanding how to solve quadratic equations like x^2 + 5x + 6 = 0.

**Assistant (Turn 1):**
Hello! I can definitely help with that. Quadratic equations are typically in the form ax^2 + bx + c = 0. 
Yours fits this perfectly! There are a few common methods to solve them: factoring, completing the square, 
or using the quadratic formula. Which method would you like to start with, or would you like an overview 
of factoring first?

**User (Turn 2):**
Factoring sounds easiest. How does that work for x^2 + 5x + 6 = 0?

**Assistant (Turn 2):**
Great choice! For factoring x^2 + 5x + 6 = 0, we're looking for two numbers that MULTIPLY to 'c' (which is 6) 
and ADD up to 'b' (which is 5). Can you think of two numbers that do that?

...
          

Your Task

  1. Form groups of 3-4 students
  2. Read through the provided transcript
  3. Complete the worksheet:
    • Count words in each message
    • Estimate tokens (Words × 1.33)
    • Track cumulative input tokens
    • Calculate costs using the provided rates
  4. Discuss the analysis questions

Time: 30-40 minutes

Analysis Table

Turn #   Who         Word Count   Est. Tokens (This Message)   Cumulative Input Tokens   Output Tokens
1        User                     T_u1                         T_u1                      0
1        Assistant                T_a1                         T_u1                      T_a1
2        User                     T_u2                         T_u1 + T_a1 + T_u2        0
2        Assistant                T_a2                         T_u1 + T_a1 + T_u2        T_a2
...      ...                      ...
Totals                                                         Sum of column             Sum of column

Key insight: The "Cumulative Input Tokens" column represents everything sent to the API for each turn

Note: Both rows of a turn show the same input value, because each turn is a single API call. Count it once per turn when totaling the column.
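
For groups that want to check their worksheet, the table can be generated with a sketch like the one below. The function names, the dictionary keys, and the placeholder messages are all hypothetical, and rounding to whole tokens is an assumption. Input is totaled once per turn (on the User row), since each turn is one API call.

```python
def analysis_rows(messages, tokens_per_word=1.33):
    """Build worksheet rows from (who, text) pairs that alternate User / Assistant."""
    rows = []
    history = 0     # estimated tokens in all earlier messages
    turn = 0
    turn_input = 0  # everything sent to the API on the current turn
    for who, text in messages:
        words = len(text.split())
        tokens = round(words * tokens_per_word)  # rounding is an assumption
        if who == "User":
            turn += 1
            turn_input = history + tokens  # full history plus the new message
        rows.append({
            "turn": turn,
            "who": who,
            "words": words,
            "tokens": tokens,
            "cumulative_input": turn_input,
            "output": tokens if who == "Assistant" else 0,
        })
        history += tokens
    return rows


def column_totals(rows):
    """Total billed input (counted once per turn, on the User row) and total output."""
    total_input = sum(r["cumulative_input"] for r in rows if r["who"] == "User")
    total_output = sum(r["output"] for r in rows)
    return total_input, total_output


# Placeholder messages of 10, 20, 5, and 10 words (hypothetical, not the transcript).
messages = [
    ("User", " ".join(["word"] * 10)),
    ("Assistant", " ".join(["word"] * 20)),
    ("User", " ".join(["word"] * 5)),
    ("Assistant", " ".join(["word"] * 10)),
]
rows = analysis_rows(messages)
for row in rows:
    print(row)
print(column_totals(rows))
```

Replace the placeholder messages with the transcript text to compare against a hand count. Small differences are expected, since people and `str.split()` may count words such as "x^2" differently.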

Cost Calculation

Input Cost = (Total Input Tokens ÷ 1,000,000) × Input Rate

Output Cost = (Total Output Tokens ÷ 1,000,000) × Output Rate

Total Cost = Input Cost + Output Cost

Example:

  • Total Input: 1,500 tokens
  • Total Output: 800 tokens
  • Model A rates: $0.50/M input, $1.50/M output
  • Input Cost: (1,500 ÷ 1,000,000) × $0.50 = $0.00075
  • Output Cost: (800 ÷ 1,000,000) × $1.50 = $0.0012
  • Total Cost: $0.00075 + $0.0012 = $0.00195
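
The same arithmetic as a minimal Python sketch. The rates are the simplified ones from this activity, not real provider prices, and the names `RATES_PER_MILLION` and `conversation_cost` are illustrative.

```python
# Simplified activity rates in dollars per million tokens (not real provider prices).
RATES_PER_MILLION = {
    "Model A": {"input": 0.50, "output": 1.50},
    "Model B": {"input": 10.00, "output": 30.00},
}


def conversation_cost(input_tokens, output_tokens, model):
    """Return (input_cost, output_cost, total_cost) in dollars."""
    rates = RATES_PER_MILLION[model]
    input_cost = input_tokens / 1_000_000 * rates["input"]
    output_cost = output_tokens / 1_000_000 * rates["output"]
    return input_cost, output_cost, input_cost + output_cost


inp, out, total = conversation_cost(1_500, 800, "Model A")
print(f"${inp:.5f} + ${out:.5f} = ${total:.5f}")  # $0.00075 + $0.00120 = $0.00195
```

Passing "Model B" instead re-runs the calculation with the high-performance rates.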

Analysis Questions

  1. Did this conversation spend more on input tokens or output tokens? Why?
  2. How does the cost per turn change as the conversation gets longer?
  3. What are the limitations of this estimation method?
  4. What are the implications of this cost structure for:
    • Designing AI-powered learning activities?
    • Budgeting for AI tools in education?
    • Teaching students to interact effectively with AI?
  5. How would costs change with Model B? What might justify using it?

Practical Strategies for Cost Management

Reduce Input Costs

  • Start new conversations when appropriate
  • Summarize previous context manually
  • Be concise in prompts
  • Use cheaper models for drafting

Reduce Output Costs

  • Request concise responses
  • Ask for bullet points instead of paragraphs
  • Break complex tasks into steps
  • Use specific instructions to avoid unnecessary output
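
To see why starting a new conversation reduces input costs, here is a rough sketch with hypothetical message sizes (50 tokens per user message, 75 per reply, every turn the same). It compares one 10-turn conversation with two separate 5-turn conversations. The function name and the numbers are illustrative only.

```python
def total_input_tokens(num_turns, user_tokens=50, assistant_tokens=75):
    """Total input tokens billed when every turn has the same (hypothetical) message sizes."""
    total = 0
    history = 0  # tokens already in the conversation
    for _ in range(num_turns):
        total += history + user_tokens  # each turn re-sends the history
        history += user_tokens + assistant_tokens
    return total


one_long = total_input_tokens(10)       # one 10-turn conversation
two_short = 2 * total_input_tokens(5)   # two fresh 5-turn conversations
print(one_long, two_short)              # 6125 3000
```

Under these assumptions, splitting the dialogue roughly halves the input tokens. The trade-off is that the second conversation starts without the earlier context, which is where a short manual summary helps.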

Class Discussion

  • Share your group's estimated total cost
  • Did input or output cost more in your analysis?
  • How might this cost structure influence how you design AI activities for students?
  • What surprised you about the token estimation process?
  • How might you teach students about efficient AI interaction?

Time: 15-20 minutes

Possible Extensions

  • Compare costs between different conversation styles (Q&A vs. iterative dialogue)
  • Estimate costs across different models (GPT-3.5 vs. GPT-4)
  • Include system prompts in your calculations
  • Rewrite parts of the conversation to be more token-efficient
  • Create a spreadsheet model to automate these calculations

Thank You!

Questions?