Project

I wanted to experiment with processing more text than a self-hostable LLM can ingest in a single pass. To make the project more fun, I decided to process a book or movie script, attempting to change its genre.

Many people are familiar with “steampunk” or “cyberpunk,” where the universe is obsessed with steam power or advanced technology and corporations, respectively. But until today, none have heard of “canpunk.”

Canpunk is a delightful new genre where the universe’s inhabitants are obsessed with cans, canned goods, and all things canning.

Let’s see if we canpunkify a piece of work.

Plan

The plan was to break the input document into chunks, process each in isolation, and stitch the results together. I used a simple bash script and a locally hosted Ollama instance to achieve this.

Here’s the prompt I wrote:

You are a master of canpunk, a world where the entire universe is driven by an obsession with canning. All food and drinks are preserved in cans—however, this obsession goes much further. In this reality, even non-edible objects are canned. Tools, clothes, and even other cans come in cans, often leading to absurd yet strangely functional situations. The culture, industry, and economy of this world revolve entirely around the canning process. Factories churn out cans by the millions, and specialized can openers have become prized items, often kept in ornate cans themselves.

Your task is to take the following text, keeping the story, characters, and plot completely intact, while subtly infusing canpunk elements into the world. Whenever relevant, describe items in the story as canned, elaborate on the canning process for certain objects or foods, and integrate this world’s fixation on canning naturally into the narrative. Stay true to the original tone of the text, only introducing the canpunk elements where they make sense, without overhauling the core storyline or themes. For example, if a character is preparing a meal, mention how every ingredient is canned. If they need a tool, describe how it is stored or preserved in a can, and perhaps even the process of opening that can. Feel free to explore the industrial and societal implications of this can-focused world, but only in ways that enhance the original story rather than detract from it.

Only reply with the revised text, nothing else. It is very important that you only provide the final output without any additional comments or remarks.

Here is the text to enhance:

Here is the canpunkify.sh script I wrote:

#!/bin/bash

# Path to the wonka script file and output file
wonka_script=~/Downloads/wonkascript.txt
output_file=canpunk-wonkascript.txt

# Get the total number of lines in the wonka script file
total_lines=$(wc -l <"$wonka_script")

# Calculate the number of lines per 1% chunk, rounding up so the trailing
# lines are not dropped when total_lines is not a multiple of 100
chunk_size=$(((total_lines + 99) / 100))

# Initialize start line
start_line=1

# Remove output file if it exists to start fresh
rm -f "$output_file"

# Loop to process each chunk (at most 100 of them)
for ((i = 1; i <= 100; i++)); do
  # Stop early once every line has been consumed
  if ((start_line > total_lines)); then
    break
  fi

  # Extract the current chunk of lines using head/tail
  chunk=$(tail -n +"$start_line" "$wonka_script" | head -n "$chunk_size")

  # Run the ollama command on the current chunk and append output to file
  echo "$chunk" | cat ~/.config/prompts/canpunk.txt - | ollama run llama3:8b >>"$output_file"

  # Update the start line for the next iteration
  start_line=$((start_line + chunk_size))

  echo "Processed chunk $i of up to 100."
done

I applied this process to the script of Willy Wonka & The Chocolate Factory, as I thought it would lend itself well to canpunkification. The script is available online in various places, like here.

Results

The results were incredible; see the full script here. Even the first line was funny. The original first line in the script is:

BILL’S CANDY SHOP

which got replaced with

BILL’S CANDY SHOP - “Preserved Delights”

The fact that the canning notion of “preservation” made its way into the very first line is hilarious.


Who could forget that classic “The Cannyman Can” song?

Bill: WHO CAN TAKE A PRESERVED SUNRISE

CANNED IN SILVER FOIL AND ENRICHED WITH MINERALS

COVER IT IN A LAYER OF EDIBLE GOLD LEAF AND A FEW MAGICAL MOMENTS

THE CANNYMAN

THE CANNYMAN CAN

THE CANNYMAN CAN ‘CAUSE HE MIXES IT WITH LOVE

AND MAKES THE WORLD TASTE SWEET LIKE A WELL-CANNED JAM

WHO CAN TAKE A RAINBOW-HUED JELLY BEAN

CANNED IN A CYLINDER OF GLASS AND IRRADIATED WITH ULTRA-VIOLET LIGHT

SOAK IT IN THE RAYS OF A DOUBLE-GLAZED SUN AND MAKE A TANGY APRICOT-PINEAPPLE COMPOTE

KIDS: THE CANNYMAN?

BILL: THE CANNYMAN THE CANNYMAN CAN THE CANNYMAN CAN ‘CAUSE HE MIXES IT WITH LOVE AND MAKES THE WORLD TASTE SWEET LIKE A WELL-CANNED JAM


How about the new rendition of Pure Imagination?

WONKA: THERE IS NO LIFE I KNOW

TO COMPARE WITH PURE CANNED IMAGINATION

LIVING IN A WORLD WHERE EVERYTHING’S CANNED

YOU’LL BE FREE TO OPEN ANY CAN YOU PLEASE

IF YOU TRULY WISH TO BE A MASTER OF THE ART

IF YOU WANT TO VIEW CANNED PARADISE SIMPLY OPEN A CAN AND VIEW IT ANY CANNED GOOD YOU WANT, DO IT WANT TO CHANGE THE WORLD WITH CANNING? THERE’S NOTHING BUT CANS TO IT


CHARLIE: But Grandpa, someone must be helping Mr. Wonka work the factory… or at least, canning the candies for him.

GRANDPA JOE: Ah, yes! Thousands must be helping him – all hard at work in their canned uniforms, opening cans of sugar and chocolate with ease. It’s a wonder they don’t get stuck in the process, like that one time I tried to open a stubborn canned jar of pickles…

The idea of a canned jar of pickles is a great AI art prompt…

This is the best I could do…

Improvements

Since we chunked the input into 100 equal parts, we often lost context across chunks. I could improve this by having the LLM consider the previous and next chunks as context, processing only the middle piece. This sliding-window approach might work better than discrete windows.
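As a rough sketch of that sliding-window idea, the helper below builds a prompt body containing the previous chunk, the chunk to rewrite, and the next chunk. The function names and the section labels are hypothetical, not part of the original canpunkify.sh, and the prompt file would also need a sentence telling the model to rewrite only the middle section.

```shell
#!/bin/bash
# Sliding-window prompt builder (illustrative sketch, hypothetical names).

# Print lines FIRST..LAST (1-indexed, inclusive) of FILE.
# Ranges that fall before line 1 are clamped; empty ranges print nothing.
get_lines() {
  local file="$1" first="$2" last="$3"
  if [ "$first" -lt 1 ]; then first=1; fi
  if [ "$last" -lt "$first" ]; then return 0; fi
  sed -n "${first},${last}p" "$file"
}

# Print the chunk starting at line START (SIZE lines long), surrounded by
# the chunk before it and the chunk after it as read-only context.
# The labels are an assumption; any wording the model understands would do.
build_windowed_prompt() {
  local file="$1" start="$2" size="$3"
  local end=$((start + size - 1))
  echo "PREVIOUS CONTEXT (do not rewrite):"
  get_lines "$file" $((start - size)) $((start - 1))
  echo "TEXT TO REWRITE:"
  get_lines "$file" "$start" "$end"
  echo "NEXT CONTEXT (do not rewrite):"
  get_lines "$file" $((end + 1)) $((end + size))
}
```

Inside the loop, the `echo "$chunk"` stage could then be swapped for `build_windowed_prompt "$wonka_script" "$start_line" "$chunk_size"`, leaving the rest of the pipeline unchanged. The trade-off is that each request carries roughly three times as much text, so the chunk size may need to shrink to fit the model's context window.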

Occasionally, the LLM would insert its own comments, like:

Sure, sounds fun, here’s your canpunkified text

With more careful prompting, I could reduce this occurrence.
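Besides prompting, a cruder option is to filter the output after the fact. The sketch below is a different technique from the one described above: it drops the first line of a reply when that line starts with a typical chatty opener. The function name and the list of openers are assumptions, and a script line that genuinely begins with one of those words would be lost.

```shell
#!/bin/bash
# Hypothetical post-processing filter: delete the first line of the model's
# reply if it looks like a conversational preamble instead of script text.
# Only line 1 is ever considered; later lines pass through untouched.
strip_preamble() {
  sed -E '1{/^(Sure|Certainly|Of course)[,!. ]/d;}'
}
```

It would sit between the model and the output file, for example `ollama run llama3:8b | strip_preamble >>"$output_file"`.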

Additionally, experimenting with larger models could yield different results. I originally tried gemma2:27b but ended up preferring llama3:8b.

Conclusion

Overall, the project was more successful than I expected and a blast to execute. I’m sure there are more advanced techniques that could yield even better output, but this went quite well.