Recent language models can accept long contexts as input, but little is known about how well they actually use them. Can LLMs be extended to contexts longer than they were trained on? This remains an open question. Researchers at Abacus AI conducted multiple experiments with different schemes for extending the context length of Llama, which was pre-trained with a context length of 2048 tokens. They applied linear position rescaling combined with instruction fine-tuning (IFT) at scale factors 4 and 16. The model rescaled to a factor of 16 can perform real-world tasks up to a 16k context length, and in some cases even up to 20-24k.
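To illustrate the idea behind linear position rescaling: positions in the rotary embedding (RoPE) are divided by the scale factor, so a longer sequence is squeezed into the position range the model saw during pre-training. The sketch below is illustrative only; the function name, defaults, and structure are assumptions, not Abacus AI's actual implementation.

```python
import torch

def rope_angles_with_interpolation(seq_len: int, head_dim: int,
                                   base: float = 10000.0,
                                   scale: float = 16.0):
    """Rotary embedding angles with linear position interpolation.

    Dividing each position index by `scale` maps, e.g., positions
    0..32767 (scale=16) into the original 0..2047 range the model
    was pre-trained on, so longer inputs reuse familiar positions.
    """
    # Standard RoPE inverse frequencies, one per pair of dimensions.
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    # Linear interpolation: shrink position indices by the scale factor.
    positions = torch.arange(seq_len).float() / scale
    angles = torch.outer(positions, inv_freq)
    return angles.cos(), angles.sin()

# Example: angles for a 32k sequence compressed into a 2k position range.
cos, sin = rope_angles_with_interpolation(seq_len=32768, head_dim=128)
```

On its own, rescaling degrades quality, which is why the experiments pair it with instruction fine-tuning at the new scale.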
Abacus.AI is a California-based AI platform that offers solutions including sales forecasting, predictive maintenance and threat detection for businesses.