AI Models & Infra

Demystifying LLM Training with the Fully Open-Source Moxin 7B Model

Speakers
Date / Time
2024-10-17
14:00

Recently, Large Language Models (LLMs) have undergone a significant transformation, marked by a rapid rise in both their popularity and capabilities. Open-source LLMs, such as LLaMA and Mistral, have contributed greatly to this ever-increasing popularity because they are easy to customize and deploy across various applications. Although LLMs offer unprecedented opportunities for research and innovation, their commercialization has raised concerns about transparency, reproducibility, and safety. Many open LLMs lack the components (such as training code and data) needed for full understanding and reproducibility, and some use restrictive licenses while claiming to be "open-source", which may hinder further innovation on LLMs. To mitigate this issue, we follow the Model Openness Framework (MOF), a ranked classification system that rates machine learning models based on their completeness and openness, following the principles of open science, open source, open data, and open access. We present Moxin 7B, a truly open-source LLM, and release its pre-training code and configurations, training and fine-tuning data, and intermediate and final checkpoints, as a continuing commitment to fully open-source LLMs.
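
Because the checkpoints are released openly, the model can be loaded with standard tooling. Below is a minimal sketch using the Hugging Face transformers library; the repo ID "moxin-org/moxin-llm-7b" is an assumption for illustration and should be checked against the actual release.

    # Minimal sketch: load an openly released 7B checkpoint and generate text.
    # The repo ID below is assumed for illustration, not confirmed by the talk.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "moxin-org/moxin-llm-7b"  # assumed repo ID; verify on the release page

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # half-precision weights to fit a 7B model in memory
        device_map="auto",           # place layers on available GPU(s)/CPU automatically
    )

    prompt = "The Model Openness Framework rates models by"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=64)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Because intermediate checkpoints are also released, the same loading pattern can in principle be pointed at an earlier training snapshot to study how capabilities emerge over the course of pre-training.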