
New AI Training Technique Is Drastically Faster, Says Google

TLDR:

Google DeepMind researchers have developed a new AI training method called JEST that surpasses state-of-the-art models with up to 13 times fewer training iterations and 10 times less computation, reducing the computational resources and potentially the energy demands of AI development.

Google’s DeepMind researchers have unveiled a new method to accelerate AI training, significantly reducing the computational resources and time needed to do the work. This new approach to the typically energy-intensive process could make AI development both faster and cheaper, according to a recent research paper—and that could be good news for the environment.

The study introduces the “multimodal contrastive learning with joint example selection” (JEST) approach, which surpasses state-of-the-art models with up to 13 times fewer iterations and 10 times less computation.

Large-scale AI systems like ChatGPT are known for their high energy consumption, demanding substantial processing power and large amounts of water for cooling. The JEST approach, by contrast, optimizes how training data is selected, cutting the number of iterations and the computation required to train a model and potentially lowering overall energy consumption.

How JEST works

JEST operates by selecting complementary batches of data to maximize the AI model’s learnability. Unlike traditional approaches that score and select individual examples independently, it considers the composition of an entire batch. By using multimodal contrastive learning, JEST identifies dependencies between data points, improving the speed and efficiency of AI training while requiring less computing power.
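To see why batch composition matters in contrastive training, consider a minimal sketch (not DeepMind’s code) of a CLIP-style multimodal contrastive loss in Python/NumPy. Because every other example in a batch acts as a negative, the contribution of any one image–text pair depends on what else is in the batch, which is the dependency that joint example selection exploits. The function name, shapes, and temperature value here are illustrative assumptions.

```python
import numpy as np

def contrastive_batch_loss(image_emb: np.ndarray, text_emb: np.ndarray,
                           temperature: float = 0.07) -> float:
    """CLIP-style symmetric contrastive loss over one batch.

    image_emb, text_emb: (batch, dim) L2-normalized embeddings of paired data.
    """
    logits = image_emb @ text_emb.T / temperature   # (batch, batch) similarity matrix
    labels = np.arange(len(logits))                 # matching pairs sit on the diagonal

    def cross_entropy(l: np.ndarray) -> float:
        l = l - l.max(axis=1, keepdims=True)        # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -log_probs[labels, labels].mean()

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    imgs = rng.normal(size=(8, 32))
    txts = rng.normal(size=(8, 32))
    imgs /= np.linalg.norm(imgs, axis=1, keepdims=True)
    txts /= np.linalg.norm(txts, axis=1, keepdims=True)
    print(contrastive_batch_loss(imgs, txts))
```

Adding or removing a single example changes the softmax denominator for every other example, so the quality of a batch is a joint property rather than a sum of per-example scores.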

The key to the approach is using pre-trained reference models to steer the data selection process, allowing the learner to focus on the highest-quality data. The study’s experiments demonstrated significant gains in learning speed and resource efficiency when training on benchmark datasets such as WebLI.
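As a rough illustration of how a reference model can steer selection, the sketch below assumes a sub-batch’s “learnability” is scored as the learner’s loss minus the reference model’s loss on the same data, so data the learner still finds hard but the reference model finds easy is prioritized. The function names, the scoring rule, and the toy numbers are illustrative assumptions, not the paper’s implementation.

```python
import numpy as np

def learnability_score(batch, learner_loss, reference_loss) -> float:
    # Hypothetical scoring rule: high learner loss (still unlearned) combined with
    # low reference loss (learnable, good-quality data) yields the highest score.
    return learner_loss(batch) - reference_loss(batch)

def select_sub_batch(candidate_batches, learner_loss, reference_loss):
    # Pick the candidate sub-batch with the highest joint learnability score.
    scores = [learnability_score(b, learner_loss, reference_loss)
              for b in candidate_batches]
    return candidate_batches[int(np.argmax(scores))]

if __name__ == "__main__":
    # Toy usage with precomputed per-batch losses standing in for real model evaluations.
    batches = ["batch_a", "batch_b", "batch_c"]
    learner = {"batch_a": 2.1, "batch_b": 3.0, "batch_c": 1.2}.get
    reference = {"batch_a": 1.9, "batch_b": 1.5, "batch_c": 1.1}.get
    print(select_sub_batch(batches, learner, reference))  # -> batch_b (largest loss gap)
```

Filtering batches this way means the expensive learner spends its training iterations on data the reference model has already flagged as worth learning, which is where the reported reductions in iterations and computation come from.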