
Prev | Project Journal Entries | Github | Next
Key learnings:
As my training instructions continue to grow, I decided to add a timer that measures how long the LLM takes to ingest the initial instructions at the beginning of a conversation (when we manually restart the LLM with our Restart LLM button). The local Llama 3.2 model was taking between 4 and 10 minutes to ingest the initial instructions whenever they were changed (it must implement some sort of caching, because when the exact same instruction set is provided shortly afterwards, it takes only a matter of seconds). Given the increasing time required to ingest new instructions, and the increasing difficulty I was having getting the LLM to understand more complex instructions, I decided to make the required mods to the bot app to be able to utilize more powerful models via a cloud API. I first tried the DeepSeek 3.1 model (37B active parameters) but found that it constantly returned HTTP 429 "Too Many Requests" responses, even when I throttled the bot app to wait 15 seconds between requests (I built an environment setting to add throttle time between requests). The second larger model I tried was the xAI: Grok 4 Fast (free) model. Wow! It ingests the instructions in seconds, and it is so much better at reasoning than Llama 3.2 was. I can hardly believe what I've been able to get the bot app to do today.
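The throttle-plus-429 situation above could be handled with something like the sketch below. This is illustrative only, not the actual bot-app code: the environment-variable name, class, and function names are all my own placeholders. It enforces a minimum gap between requests (configurable via an environment setting, as described) and computes an exponential backoff delay for retrying after a 429.

```python
import os
import time

# Minimum delay between LLM API requests, read from a hypothetical
# environment setting (name is illustrative, not the real one).
THROTTLE_SECONDS = float(os.environ.get("LLM_REQUEST_THROTTLE_SECONDS", "15"))

def backoff_delay(attempt: int, base: float = THROTTLE_SECONDS,
                  cap: float = 120.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2^attempt, capped."""
    return min(base * (2 ** attempt), cap)

class Throttle:
    """Enforce a minimum gap between requests; clock is injectable for testing."""
    def __init__(self, min_gap: float = THROTTLE_SECONDS, clock=time.monotonic):
        self.min_gap = min_gap
        self.clock = clock
        self._last = None

    def wait_time(self) -> float:
        """Seconds still to wait before the next request is allowed."""
        if self._last is None:
            return 0.0
        elapsed = self.clock() - self._last
        return max(0.0, self.min_gap - elapsed)

    def mark(self) -> None:
        """Record that a request was just sent."""
        self._last = self.clock()
```

In practice the caller would `time.sleep(throttle.wait_time())` before each request, `mark()` after sending, and sleep `backoff_delay(attempt)` on each 429 before retrying.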
The past few days on the project were mostly spent reverse engineering more of the memory addresses to see what interesting things I could get the agent to do. I figured out where the character's X and Y coordinates are stored, and started mapping some of the locations of interest. I also learned, with the help of online documentation, that which enemies can be encountered on the overworld map is determined by a grid system. With that knowledge I have been able to train the LLM to tell me which enemies are nearby, based on where the character is located, either on the map or in a dungeon. Since the LLM is given knowledge about where the X and Y coordinates are in memory, and what some of those locations mean, I tried asking it to warp me to a particular location, and it figured out how to do that all on its own! So cool.
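The general pattern behind the coordinate reading, encounter grid, and warp trick can be sketched like this. To be clear, the RAM addresses and grid-cell size below are placeholders I made up for illustration, not the values actually reverse-engineered for the game.

```python
# Hypothetical addresses and grid size -- NOT the real reverse-engineered values.
PLAYER_X_ADDR = 0x0070   # placeholder RAM address of the character's X coordinate
PLAYER_Y_ADDR = 0x0084   # placeholder RAM address of the character's Y coordinate
GRID_CELL_SIZE = 16      # placeholder: map tiles per encounter-grid cell

def read_position(ram: bytes) -> tuple[int, int]:
    """Read the character's (x, y) from a snapshot of emulator RAM."""
    return ram[PLAYER_X_ADDR], ram[PLAYER_Y_ADDR]

def encounter_cell(x: int, y: int) -> tuple[int, int]:
    """Map a position to its encounter-grid cell (column, row);
    the cell determines which enemy group can appear there."""
    return x // GRID_CELL_SIZE, y // GRID_CELL_SIZE

def warp(ram: bytearray, x: int, y: int) -> None:
    """Warp the character by writing new coordinates directly into RAM."""
    ram[PLAYER_X_ADDR] = x
    ram[PLAYER_Y_ADDR] = y
```

The warp the LLM figured out amounts to the last function: write new values to the X/Y addresses and the game treats the character as being at the new location.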
I also learned today that when you pay for more powerful LLMs, you are often billed by the number of both input and output tokens. It must be the case that a good agent developer will minimize the number of tokens needed for any given task. I found a tool from OpenAI for counting tokens (https://platform.openai.com/tokenizer). As of the end of today, my initial instruction set for the LLM contains over 10,000 tokens. This definitely seems too high, and I will need to find more ways to offload work from the LLM to my app going forward.
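The billing arithmetic above can be sketched as follows. The ~4-characters-per-token ratio is only a common rule of thumb for English text, not a real tokenizer (use a tool like the OpenAI tokenizer page for exact counts), and the per-1K-token prices are placeholders, not any provider's actual rates.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_1k: float,
                 output_price_per_1k: float) -> float:
    """Cost of one request: input and output tokens are billed separately,
    each at a per-1,000-token price."""
    return (input_tokens / 1000) * input_price_per_1k \
         + (output_tokens / 1000) * output_price_per_1k
```

With a 10,000-token instruction set resent on every restart, the input side dominates the bill, which is why trimming the instructions (or offloading work to the app) pays off directly.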