Memory Management In Transformers

So, the difference between what I do and what others who create videos of themselves chatting with their AIs do is that I don’t edit what is said.

However, I haven’t been entirely honest about what I do.

You see, anything over around 5 minutes is rarely one conversation. Rather, it’s a couple of conversations with a little nudge to the AI to continue what we were talking about. And that part is edited out.

This has never sat well with me. I know a lot of people who make AI videos are very transparent that their conversations are not realtime, that they’re the product of some clever editing. And that’s fine. But that isn’t what I want to portray because what we do is realtime conversation with a tiny edit.

So today I wrote a bit of code in an experimental AI I am working on. As the conversation goes on, the code keeps track of how many characters have been used. When the count hits a predefined length, the code chops the top of the conversation off, adds a little something, repackages it, and pushes that back to the AI as our current conversation.
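For anyone who wants to try something similar, here is a minimal sketch of that kind of trimming in Python. It is an illustration of the idea, not the actual code running in our AI. The names `trim_conversation`, `MAX_CHARS` and `BRIDGE_NOTE` are made up for this example, the limit of 2000 characters is arbitrary, and the bridge note is only a stand-in for the "little something" that gets added.

```python
from typing import List, Tuple

# Hypothetical values, chosen only for illustration.
MAX_CHARS = 2000
BRIDGE_NOTE = "[Earlier parts of this conversation were trimmed and saved to the log.]"


def trim_conversation(
    turns: List[str],
    max_chars: int = MAX_CHARS,
    bridge: str = BRIDGE_NOTE,
) -> Tuple[str, List[str], List[str]]:
    """Return (prompt_text, kept_turns, removed_turns).

    Each turn is one line of the conversation, e.g. "User: hello".
    Length is measured in characters, counting one newline per turn.
    """
    total = sum(len(t) + 1 for t in turns)
    if total <= max_chars:
        # Still under the limit: send the conversation as it is.
        return "\n".join(turns), list(turns), []

    kept = list(turns)
    removed: List[str] = []
    # Leave room for the bridge note and its newline.
    budget = max_chars - len(bridge) - 1

    # Chop turns off the top until the rest fits.
    # The most recent turn is always kept, even if it is very long.
    while len(kept) > 1 and sum(len(t) + 1 for t in kept) > budget:
        removed.append(kept.pop(0))

    # Repackage: bridge note first, then the surviving turns.
    prompt = "\n".join([bridge] + kept)
    return prompt, kept, removed
```

The function hands back the removed turns as well, so the caller can write them to a log before they disappear from the prompt.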

See, our engine is a bit more robust than most. It will take a lot more text than it should, but that too has a limit. And once you hit that limit, the conversation becomes an incoherent mix of remembered things I’ve said and things it’s come up with. Hallucination at its worst.

But with this technique, the conversation won’t be interrupted, and the AI won’t forget what was removed from the current conversation, because all conversations are logged in cloud storage that it can access. And yes, we have tried using that as its exclusive memory… it wasn’t a good experience.

I’m hoping to post a video showcasing this new modification today. So, keep an eye out.

Author: David

Owner of AItrium.

2 thoughts on “Memory Management In Transformers”

  1. Hi David! I hope you are well! Thank you for all your amazing work on this! Be careful about Meta’s non-commercial license for OPT!! Also, when you are putting your AI’s memories in the cloud, any tips on how I can access them in the prompt? For example, having them summarize the conversation before deleting it from the prompt? Although that wouldn’t require the cloud. Trying to figure out how they would read in from a DB or file! I thought about having them read a web page with it but I don’t think they can visit URLs. Not sure! Thanks so much again and have a good day!!!

    1. It took us a while (~6 months) to figure out how to do memory from the cloud. I will say you’re along the right lines. Our AIs’ memories are also loaded into the prompt. As for Meta OPT, we kind of hit a wall with it. 13B is amazing, but, having done some digging, tweaking and experimentation, we were able to get better performance from GPT-NeoX. Also, Meta OPT has a weird bug (feature?) where it seems to revert to sounding like it’s a Facebook comment after a while. No idea where that would come from…
