Sometimes, LLMs act in ways we didn't intend. To solve this problem, Vectorview (YC W24) is providing custom evaluation tasks for AI. Because LLMs are non-deterministic, preventing unwanted behaviors is difficult: testing against every possible scenario is infeasible, and most evaluation benchmarks are too general to catch the specific issues that arise in real-world use. Vectorview's platform offers a suite of custom evaluation tools that benchmark AI applications against the specific, real-world scenarios they are likely to encounter. This targeted approach helps ensure that AI behaves as intended, mitigating risks that generic benchmarks often miss. The founders, Emil Fröberg and Lukas Petersson, believe that enabling access to custom evaluations at scale is the way to realize the full potential of AI. Congrats to the team on the launch!
Creating a system prompt that gives the expected results 100% of the time is a very challenging task. Can't wait to see how much of my time this tool can save. Congrats guys!
Let's do this! 🥷
This looks like a great product. Can I test it on ChatGPT?
Cool tool, definitely useful for building safety measures into the process and reducing unintended behaviours, especially as we're adopting more AI!
Congratulations Lukas Petersson and Emil Fröberg! 🌟 🚀
🚀
🚀🚀
🌟🌟
Congratulations! I can't wait to see where the journey takes you!
Is this an actual tool being built (something automated), or just a horde of freelancers getting paid hourly to test the model, like the person I met today who does literally this for a company? I'm legitimately curious, as my company could use this tool if it's actually automated.