Show HN: Project AELLA – Open LLMs for structuring 100M research papers
We're releasing Project AELLA - an open-science initiative to make scientific knowledge more accessible through AI-generated structured summaries of research papers.
Blog: https://inference.net/blog/project-aella
Visualizer: https://aella.inference.net
Models: https://huggingface.co/inference-net/Aella-Qwen3-14B, https://huggingface.co/inference-net/Aella-Nemotron-12B
Highlights:
- Released 100K research paper summaries in a standardized JSON format, with an interactive visualization
- Fine-tuned open models (Qwen 3 14B & Nemotron 12B) that match GPT-5/Claude 4.5 performance at 98% lower cost (~$100K vs ~$5M to process 100M papers)
- Built on distributed "idle compute" infrastructure - think SETI@Home for LLM workloads
Goal: Process ~100M papers in total, then link them to OpenAlex metadata and convert them into copyright-respecting "Knowledge Units"
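For readers curious what a "structured summary in standardized JSON" might look like, here is a minimal sketch. The field names and the OpenAlex ID are illustrative guesses, not the project's actual schema:

```python
import json

# Hypothetical structured summary record. Field names are illustrative;
# the real AELLA/OSSAS schema may differ.
summary = {
    "title": "Example Paper Title",
    "openalex_id": "W0000000000",  # placeholder; link target once OpenAlex metadata is joined
    "research_questions": ["What problem does the paper address?"],
    "methods": ["Brief description of the methodology"],
    "key_findings": ["One factual claim per entry"],
    "limitations": ["Stated caveats"],
}

print(json.dumps(summary, indent=2))
```

A fixed schema like this is what makes 100M summaries queryable and linkable to metadata, rather than free-text blurbs.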
The models are open, the evaluation framework is transparent, and the summaries are publicly available. This builds on Project Alexandria's legal and technical foundation for extracting factual knowledge while respecting copyright.
The technical deep-dive in the post covers our training pipeline, dual evaluation methods (LLM-as-judge plus a QA dataset), and an economic comparison showing a 50x cost reduction versus closed models.
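As a toy illustration of the QA-dataset side of the evaluation: score a summary by how many reference answers it preserves. The real pipeline presumably uses an LLM judge and semantic matching; exact-substring matching and the `qa_recall` helper below are simplified stand-ins of my own, not the project's code:

```python
def qa_recall(summary: str, reference_answers: list[str]) -> float:
    """Fraction of reference answers recoverable (verbatim) from the summary.

    Toy proxy for QA-based evaluation: a good summary should retain the
    facts needed to answer questions about the paper.
    """
    if not reference_answers:
        return 0.0
    hits = sum(1 for ans in reference_answers if ans.lower() in summary.lower())
    return hits / len(reference_answers)

summary = "The study finds that fine-tuned 14B models match frontier models on summarization."
answers = ["fine-tuned 14B models", "frontier models", "retrieval benchmarks"]
print(qa_recall(summary, answers))  # 2 of 3 answers found
```

The appeal of a QA-style metric is that it measures information retention directly, complementing the more holistic (but noisier) LLM-as-judge scores.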
Happy to answer questions about the training approach, evaluation methodology, or infrastructure!
Named after https://x.com/Aella_Girl perhaps?
Just saw this tweet, so I guess not:
https://x.com/samhogan/status/1988448512137457767
"Due to an unforeseen naming conflict, we are renaming Project AELLA to Project OSSAS (Open Source Summaries At Scale)"