Curriculum learning–based LLM shows benefits of step-by-step reasoning in AI systems

The figure illustrates a comprehensive dataset structure designed to evaluate diverse tasks across multiple domains. Credit: *arXiv* (2025). DOI: 10.48550/arxiv.2501.06186

A team of AI researchers at Mohamed bin Zayed University of AI in Abu Dhabi, working with a colleague from the University of Central Florida, has developed an LLM trained with curriculum learning, called LlamaV-o1, that its makers claim shows the benefits of step-by-step reasoning in AI systems. In their study, published on the arXiv preprint server (and also on GitHub), the group built the model to lay out its reasoning step by step, making it easier to understand how it arrives at its answers.

Curriculum learning, as it relates to AI, is a training strategy whereby a model is first trained on simpler tasks and then gradually exposed to more complex ones, similar to the way humans learn. In this new study, the team in Abu Dhabi has emphasized this approach as part of the way that its LLM learns to form an answer to a query.
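The general idea of curriculum learning can be illustrated with a minimal sketch. This is not the authors' training code: the function name `curriculum_stages`, the toy task list, and the use of string length as a stand-in difficulty score are all assumptions made for illustration. The sketch orders examples from easy to hard and yields progressively larger training pools, so that later stages include harder material.

```python
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def curriculum_stages(
    examples: Sequence[T],
    difficulty: Callable[[T], float],
    num_stages: int,
) -> Iterator[List[T]]:
    """Yield cumulative training pools, easiest examples first.

    Hypothetical helper: each stage contains every example from the
    previous stage plus the next-hardest slice.
    """
    if num_stages < 1:
        raise ValueError("num_stages must be at least 1")
    ordered = sorted(examples, key=difficulty)  # stable sort, easy -> hard
    for stage in range(1, num_stages + 1):
        cutoff = len(ordered) * stage // num_stages
        yield ordered[:cutoff]


# Toy data; string length is only a placeholder for a real difficulty measure.
tasks = [
    "2+2",
    "12*7",
    "solve x^2 - 5x + 6 = 0",
    "3+4",
    "integrate x*sin(x) dx",
]
stages = list(curriculum_stages(tasks, difficulty=len, num_stages=3))

for number, pool in enumerate(stages, start=1):
    print(f"stage {number}: {pool}")
```

In a real training loop, each pool would be used for one or more epochs before moving to the next stage. How difficulty is measured and how stages are scheduled are design choices that the sketch leaves open.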

The approach follows their overall goal of making the process by which an LLM arrives at an answer more transparent to the person who posed the query. Aligned with that goal, the same team has also released VRC-Bench, a benchmark designed to test how well AI models reason their way through a problem as they search for an answer. The main difference between VRC-Bench and other benchmarks currently in use is its focus on evaluating the step-by-step approach that AI models take to solving queries.

One of the hallmarks of LlamaV-o1, the team notes, is that it outlines the reasoning steps it takes as it seeks an answer. This feature, they suggest, is becoming more important as LLMs and other AI models are deployed in critical applications such as medicine and financial forecasting. Following the logic helps boost confidence in the final answer or highlights when an error occurs.

Another feature is the use of beam search, a decoding algorithm that, at each step of generation, keeps several of the highest-scoring candidate sequences instead of committing to a single one. In this case, it allows LlamaV-o1 to generate multiple reasoning paths and to select the one most appropriate for answering the original query, resulting in improved accuracy.
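A generic beam search can be sketched in a few lines. The code below is a textbook version run on a hand-written toy model, not the decoding procedure used in LlamaV-o1; the names `beam_search`, `TOY_MODEL`, and `toy_next_probs`, and all probabilities, are hypothetical. It shows how keeping two candidate paths can find a higher-probability sequence than always taking the single best next token.

```python
import math
from typing import Callable, Dict, List, Tuple

Sequence_ = Tuple[str, ...]


def beam_search(
    next_probs: Callable[[Sequence_], Dict[str, float]],
    beam_width: int,
    max_steps: int,
    end_token: str = "<end>",
) -> List[Tuple[Sequence_, float]]:
    """Return up to beam_width (sequence, log-probability) pairs, best first."""
    beams: List[Tuple[Sequence_, float]] = [((), 0.0)]
    for _ in range(max_steps):
        candidates: List[Tuple[Sequence_, float]] = []
        for seq, score in beams:
            if seq and seq[-1] == end_token:
                candidates.append((seq, score))  # finished paths carry over
                continue
            for token, prob in next_probs(seq).items():
                if prob > 0:
                    candidates.append((seq + (token,), score + math.log(prob)))
        # Keep only the beam_width highest-scoring paths.
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        if all(seq[-1] == end_token for seq, _ in beams):
            break
    return beams


# Hypothetical toy model: next-token probabilities given the path so far.
TOY_MODEL: Dict[Sequence_, Dict[str, float]] = {
    (): {"A": 0.6, "B": 0.4},
    ("A",): {"x": 0.5, "y": 0.5},
    ("B",): {"z": 0.9, "w": 0.1},
}


def toy_next_probs(seq: Sequence_) -> Dict[str, float]:
    return TOY_MODEL.get(seq, {"<end>": 1.0})


greedy_path, greedy_score = beam_search(toy_next_probs, beam_width=1, max_steps=5)[0]
beam_path, beam_score = beam_search(toy_next_probs, beam_width=2, max_steps=5)[0]
print("greedy:", greedy_path, round(math.exp(greedy_score), 2))
print("beam 2:", beam_path, round(math.exp(beam_score), 2))
```

With a width of 1 the search follows the locally best first token "A" and ends with total probability 0.3, while a width of 2 keeps "B" alive long enough to find the path with probability 0.36. In a reasoning setting, each "token" could instead be a whole reasoning step, which is an assumption about how such a search might be applied, not a description of the paper's method.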

More information:
Omkar Thawakar et al, LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs, arXiv (2025). DOI: 10.48550/arxiv.2501.06186

LlamaV-o1: mbzuai-oryx.github.io/LlamaV-o1/

Journal information:
arXiv

Citation:
LlamaV-o1: Curriculum learning–based LLM shows benefits of step-by-step reasoning in AI systems (2025, January 14)
retrieved 14 January 2025
from https://techxplore.com/news/2025-01-llamav-o1-curriculum-learningbased-llm.html
