Towards Practical Multimodal Large Models
As the inevitable path towards general artificial intelligence, multimodal large models have shown great potential for driving the transition to intelligent systems. They are not only at the forefront of academic exploration but also a catalyst for building a community with a shared future for mankind and for promoting global cooperation. However, high deployment and inference costs, frequent hallucinations, and a scarcity of high-quality data all greatly restrict their development. Starting from these key issues, MiniCPM-V has, for the first time, achieved end-side (on-device) multimodal understanding capabilities comparable to those of closed-source commercial multimodal large models.