TL;DR:
– Researchers at MIT have developed a solution to a problem that can cause AI chatbots to crash during long conversations.
– The team found that when the cache of a large language model is full, the first pieces of data are sometimes bumped out, causing the model to fail.
– The researchers developed a method called StreamingLLM, which allows a chatbot to maintain a continuous conversation without crashing, even when the conversation involves millions of words.
– StreamingLLM outperformed another method that avoids crashing by recomputing part of past conversations, running more than 22 times faster.
– This development could enable chatbots to be used in a wider range of applications.
Researchers at MIT have developed a solution to a problem that can cause AI chatbots to crash during long conversations. The problem arises when the cache of a large language model, which acts as a conversation memory, becomes full: the first pieces of data are bumped out, causing the model to fail. The researchers developed a method called StreamingLLM, which allows a chatbot to keep chatting without crashing or slowing down, no matter how long the conversation goes. They found that keeping the first few data points in memory keeps the chatbot's performance stable. The method, which outperformed another approach that constantly recomputes part of past conversations, could allow chatbots to be used in tasks such as copywriting, editing, or generating code. The researchers described the method as "transformative" and "revolutionary" and plan to continue developing ways to improve chatbots' ability to remember words and previous conversations.
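The core idea reported above can be sketched as a cache-eviction policy: instead of dropping the oldest entries wholesale when the cache fills, always preserve the first few entries and evict only from the middle. The sketch below is a toy illustration of that policy under stated assumptions; the class and parameter names (`SinkCache`, `num_sinks`, `window`) are hypothetical and not the researchers' actual implementation or API.

```python
class SinkCache:
    """Toy eviction policy in the spirit of the reported method:
    always keep the first `num_sinks` entries plus a sliding window
    of the most recent `window` entries; evict everything between.
    Names and structure here are illustrative assumptions only."""

    def __init__(self, num_sinks=4, window=8):
        self.num_sinks = num_sinks  # entries that are never evicted
        self.window = window        # recent entries retained
        self.cache = []             # stands in for per-token cache entries

    def append(self, entry):
        self.cache.append(entry)
        limit = self.num_sinks + self.window
        if len(self.cache) > limit:
            # Evict the oldest non-protected entry,
            # never the initial "kept" entries.
            del self.cache[self.num_sinks]


cache = SinkCache(num_sinks=2, window=3)
for token in range(10):
    cache.append(token)
print(cache.cache)  # prints [0, 1, 7, 8, 9]
```

The naive policy this replaces would have evicted entries 0 and 1 first; keeping them is what, per the article, prevents performance from collapsing as the conversation grows.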