Hum Open Sources Cutting-Edge LLM for Long Text Sequences | News Direct

News release by 500NewsWire

Charlottesville, VA | August 30, 2023 09:03 AM Eastern Daylight Time

 

Charlottesville, VA, Aug 30, 2023 (500NewsWire) -- This week Hum, the leading provider of AI & data intelligence solutions for publishers, announced the open-source release of its new large language model (LLM), Lodestone.

Using Google's BERT architecture as a foundation and incorporating several improvements released since, Hum developed a novel model for processing longer text sequences. Lodestone is the highest-performing model of its size and sequence length on the MTEB Leaderboard. This makes it particularly compelling for real-time applications on long text, where larger models may be prohibitively expensive or slow.

 

"We are excited to contribute to the open source community and advance natural language processing research, while helping other organizations unlock value from content data," said Niall Little, Hum’s CTO. "Lodestone can contextualize an entire research paper, surfacing content insights that previous models couldn't comprehend looking at just one or two paragraphs at a time."

Lodestone was trained on a large, publicly available dataset that includes more than 1 million scholarly research articles and publications. The model can process text sequences of up to 4096 tokens, capturing topical context and nuance better than other commonly used LLMs.

Key features include:

  • Long sequence embedding
  • Improved semantic understanding
  • Sentence vectorization for information retrieval, clustering, and sentence similarity tasks
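As a rough illustration of the retrieval and sentence-similarity use case listed above, the sketch below ranks documents by cosine similarity between embedding vectors. It does not call Lodestone itself: the three-dimensional vectors, the document names, and the helper functions `cosine_similarity` and `rank_by_similarity` are hypothetical stand-ins for embeddings that a model such as Lodestone would produce.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as unrelated.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vec, doc_vecs):
    """Return (doc_id, score) pairs sorted from most to least similar."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors standing in for real embeddings (hypothetical values).
docs = {
    "paper_a": [0.9, 0.1, 0.0],
    "paper_b": [0.0, 1.0, 0.0],
    "paper_c": [0.7, 0.7, 0.1],
}
query = [1.0, 0.0, 0.0]

ranking = rank_by_similarity(query, docs)
for doc_id, score in ranking:
    print(f"{doc_id}: {score:.3f}")
```

In practice the vectors would have hundreds of dimensions and come from the embedding model, but the ranking step would look much the same.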

 

Starting today, developers and enterprises can fine-tune and deploy their own models using Lodestone on Hugging Face, putting long-sequence AI applications in reach of more projects and businesses.

"This release furthers our commitment to using AI to solve challenges for publishers, societies, and other content-driven organizations," said Little. "We’re continuing to fine-tune the model for the needs of media and publishing industry clients, but look forward to seeing how the community can build on Lodestone to advance applications for content intelligence and responsible AI."

About Hum

Hum is a leading AI and data platform designed for publishers, societies, and media. Hum offers powerful content and audience intelligence, enabling content-driven organizations to derive strategic insights and deliver personalized experiences. Learn more at hum.works.

 

Contact Details

Laura Simis
laura@hum.works

Company Website
https://www.hum.works/


Tags

LLM, large language model, natural language processing, nlp, ai, machine learning, artificial intelligence, data intelligence, huggingface, content understanding, publishing, open source, data ingestion, model release