Vectorview’s Post

Vectorview reposted this

Y Combinator

948,794 followers

Sometimes, LLMs act in ways we didn't intend. To solve this problem, Vectorview (YC W24) is providing custom evaluation tasks for AI.

It’s difficult to prevent unwanted behaviors in LLMs due to their non-deterministic nature. Testing them against every possible scenario is hard, making it tough to catch all unintended behaviors. Additionally, most evaluation benchmarks are too general, missing the specific issues that can arise in real-world use.

Vectorview’s platform offers a suite of custom evaluation tools designed to benchmark AI applications against specific, real-world scenarios they are likely to encounter. This targeted approach ensures that AI behaves as intended, mitigating the risk of unintended behaviors that generic benchmarks often miss.

The founders, Emil Fröberg and Lukas Petersson, believe that enabling access to custom evaluations at scale is the way to realize the full potential of AI. Congrats to the team on the launch!

Launch YC: Vectorview: Evaluating the capabilities of AI 🤖 | Y Combinator

ycombinator.com

John Donovan

CEO & Founder of TripRanger - Gamifying Travel | Retired at 26 from edTech | 12 Years full-time Traveler

5mo

Is this an actual tool being built (something automated) or just a horde of freelancers getting paid hourly to test the model, like the person I met today who does literally this for a company? I'm legitimately curious, as my company could use this tool if it's actually automated.

seth Ikiroma-Owiye

Fullstack Developer | Django, React, A.I Langchain

5mo

Creating a system prompt that gives expected results 100% of the time is a very challenging task. Can't wait to see how much of my time can be saved with this tool. Congrats guys!

Alan Zabihi

🥷 Superagent (YC W24)

5mo

Let's do this! 🥷

Umair liaqat

8+ Years in Full Stack Development | Specialized in Ruby on Rails | UX/UI Designer | Expert Front End Developer

5mo

This looks like a great product. Can I test it on ChatGPT?

Aatika S.

Women have brilliant ideas. I help turn them into reality! 🌱 | Founder @ Elevacity - helping women build & launch their first tech startup | Tech Leadership | Innovation | Join my newsletter ⬇️ | Tea Lover 🍵

4mo

Cool tool, definitely useful for building safety measures into the process and reducing unintended behaviours! Especially as we're adopting more AI

Emma Renman

Helping research-based startups

5mo

Congratulations, Lukas Petersson and Emil Fröberg! 🌟 🚀

Alexander Wikström

CEO & Founder @ Darwin | Building your next BDR & AI co-worker

5mo

🚀

David Stålmarck

Engineering Mathematics student @ NUS, Lund || Business student @ Lund

5mo

🚀🚀

Henrik Tingström

Student finance and grants. Reshaped.

5mo

🌟🌟

Marvin Völter

Product, Software, AI & GreenTech | GSP @ Bosch | Ex Google, SAP | Software Engineering M.Sc. with distinction

5mo

Congratulations! I can't wait to see where the journey takes you!
