Super proud of Mike and his team. Many people have asked me over the last year why long context and reasoning are so important. My answer is simple: long context is, in other words, a much better memory that lets an LLM fully apply its ability to comprehend incoming text, which, as we all know, is a million miles better than RAG. A very clear use case is processing huge documentation, like a corporate wiki. The deal with reasoning over long context is also simple: if models gain the ability to process long context but do it poorly, then what's the value? That's why a proper benchmark like BABILong is so desperately needed. It opens your eyes to how inefficient most modern methods for processing long context with LLMs are, and it helps you understand what the real goal is. Let's make LLMs understand long context better together!
Working hard with Yura Kuratov and Aydar Bulatov on BABILong, benchmarking popular LLMs for Reasoning-in-a-Haystack over long contexts.
Great! I hope all is going well, Daniel!
Great point on the importance of long context for LLMs, Mike - it really magnifies the difference a comprehensive benchmark like BABILong can make in evaluating their efficacy.