Multimodal Polymer Property Prediction

Using the Lumee-7B and Lumee-MM (multimodal) model variants, we developed a polymer property prediction system built on a fusion architecture that integrates natural language and molecular structure. This project demonstrates how domain-specific AI can be rapidly deployed using Lumee’s foundation models, even on limited data, to match or exceed traditional machine learning methods.

May 29, 2025

Hasan Kurşun

🔧 Method

We used:

  • Textual embeddings from Lumee-7B to encode polymer SMILES/PSMILES representations

  • 3D structural embeddings from cheminformatics models (e.g., Uni-Mol) integrated via Lumee-MM

  • LoRA tuning to adapt models to 29k+ experimental and DFT-labeled data points for 22 polymer properties
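For readers unfamiliar with LoRA, the low-rank update it trains can be sketched as below. This is an illustration of the general technique, not the project's training code: the matrix sizes, the rank, and the `alpha` value are assumptions chosen for readability, and the source does not state which layers or hyperparameters were used.

```python
import random

Matrix = list[list[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w: Matrix, a: Matrix, b: Matrix, alpha: float) -> Matrix:
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank."""
    rank = len(a)            # A has shape (r, d_in)
    delta = matmul(b, a)     # (d_out, r) @ (r, d_in) -> (d_out, d_in)
    scale = alpha / rank
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


random.seed(0)
d_out, d_in, rank = 8, 8, 2  # illustrative sizes only

# Frozen pretrained weight: never updated during LoRA tuning.
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# Trainable factors: A starts small and random, B starts at zero,
# so the adapted layer initially behaves exactly like the frozen one.
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(rank)]
b = [[0.0] * rank for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha=16)

full_params = d_out * d_in            # parameters in the frozen matrix
lora_params = rank * (d_in + d_out)   # parameters actually trained
print(f"trainable: {lora_params} of {full_params}")
```

Only `a` and `b` receive gradients, which is why this kind of adaptation is feasible on a dataset in the tens of thousands of points. The saving grows with layer size, since the trained parameter count scales with `d_in + d_out` rather than `d_in * d_out`.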

The result is a multimodal fusion pipeline that reasons over both language and structure, which makes it well suited to property regression and scientific discovery.
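One simple way to combine the two embedding types is late fusion by concatenation followed by a regression head. The sketch below assumes that approach for illustration; the source does not describe the actual fusion mechanism inside Lumee-MM. The embeddings are random stand-ins, and every name and dimension here is hypothetical.

```python
import random


def fuse(text_emb: list[float], struct_emb: list[float]) -> list[float]:
    """Late fusion by concatenation: one vector holding both modalities."""
    return list(text_emb) + list(struct_emb)


def predict(weights: list[float], bias: float, features: list[float]) -> float:
    """Linear regression head on top of the fused vector."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mse(weights, bias, xs, ys) -> float:
    """Mean squared error of the head over a dataset."""
    return sum((predict(weights, bias, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear_head(xs, ys, lr=0.05, epochs=500):
    """Fit the head with plain batch gradient descent on the MSE."""
    dim, n = len(xs[0]), len(xs)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = predict(weights, bias, x) - y
            for j in range(dim):
                grad_w[j] += 2 * err * x[j] / n
            grad_b += 2 * err / n
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


random.seed(0)
TEXT_DIM, STRUCT_DIM, N = 4, 3, 40  # toy sizes; real embeddings are far larger

# Random stand-ins for language-model and 3D-structure embeddings.
text_embs = [[random.uniform(-1, 1) for _ in range(TEXT_DIM)] for _ in range(N)]
struct_embs = [[random.uniform(-1, 1) for _ in range(STRUCT_DIM)] for _ in range(N)]

fused = [fuse(t, s) for t, s in zip(text_embs, struct_embs)]

# Synthetic targets from a hidden linear rule, so the fit can be checked.
true_w = [random.uniform(-1, 1) for _ in range(TEXT_DIM + STRUCT_DIM)]
targets = [predict(true_w, 0.5, x) for x in fused]

initial_loss = mse([0.0] * len(true_w), 0.0, fused, targets)
weights, bias = fit_linear_head(fused, targets)
final_loss = mse(weights, bias, fused, targets)
print(f"MSE before: {initial_loss:.4f}, after: {final_loss:.6f}")
```

Concatenation keeps the two encoders independent, so either one can be swapped without retraining the other. Cross-attention or gated fusion are common alternatives when the modalities need to interact more closely.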

🧠 Key Properties Predicted

  • Thermal and electronic: Glass Transition Temperature (Tg), Melting Point (Tm), Band Gap (Egc/Egb)

  • Mechanical: Young’s Modulus, Strength, Elongation

  • Chemical: Density, Refractive Index, Conductivity

  • Gas Permeabilities: CO₂, O₂, N₂, CH₄, He, H₂

📊 Performance Highlights

Property                      R² Score (Lumee-based)
Glass Transition Temp (Tg)    0.89 ✅
Band Gap (Egc)                0.92 ✅
Density                       0.82 ✅
Atomization Energy            0.96 ✅
Gas Permeability (CH₄)        0.87 ✅

Compared with classical models, the Lumee-based fusion models matched or exceeded state-of-the-art (SOTA) benchmarks on most properties, even without pretraining on massive polymer datasets.
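For reference, the R² score reported above is the coefficient of determination: one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch of the calculation follows; the Tg values in it are made up for illustration and are not taken from this study.

```python
def r2_score(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when all true values are identical")
    return 1 - ss_res / ss_tot


# Hypothetical glass transition temperatures in kelvin (illustrative only).
tg_measured = [350.0, 400.0, 450.0, 500.0]
tg_predicted = [360.0, 390.0, 460.0, 490.0]

score = r2_score(tg_measured, tg_predicted)
print(f"R² = {score:.3f}")  # prints "R² = 0.968"
```

A score of 1.0 means perfect predictions, while 0.0 means the model does no better than always predicting the mean of the measured values.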

🔍 Why Lumee Worked

  • LLM chemical knowledge embedded during pretraining on scientific corpora

  • Long-context support (128k tokens) enabled full-molecule and sequence reasoning

  • Modular fusion-ready design via Lumee-MM accelerated experimentation

  • Token-level interpretability supported explainable AI insights

🚀 Applications

  • AI-assisted polymer discovery

  • Predictive simulation tools for materials R&D

  • Educational or institutional research assistants

  • Industry adoption in sustainable materials, electronics, coatings, and biomedical sectors

📩 Interested in applying Lumee to your research or chemistry pipelines?

hello@lumees.io


A scientific paper will be available soon.
