Another great paper on In-Context Learning (ICL) for LLMs: https://lnkd.in/e4SP4K5d More and more, we're hearing from builders that ICL is an improvement over fine-tuning for dialing in domain/task specificity. Fine-tuning is slow and expensive, and it improves specialization at the cost of generalizability. An efficient ICL pipeline, on the other hand, is quick and cheap to set up, boosts performance on specialized tasks without sacrificing general ones, and can be iterated quickly as the project context/use case shifts. If any builders are still choosing fine-tuning over ICL, let's chat. I'd love to learn more!
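For concreteness, the core of an ICL pipeline is just prompt assembly: labeled domain examples are placed in the prompt instead of being baked into the weights. A minimal sketch, where the task, examples, and labels are all hypothetical placeholders:

```python
# Minimal in-context learning sketch: domain specificity comes from a
# handful of labeled examples in the prompt, not from weight updates.
# The transaction-classification task and examples are made up for illustration.

EXAMPLES = [
    ("Wire transfer of $9,900 split across two days", "suspicious"),
    ("Monthly payroll deposit from registered employer", "normal"),
]

def build_icl_prompt(query: str) -> str:
    """Assemble a few-shot prompt from labeled domain examples."""
    lines = ["Classify each transaction as 'suspicious' or 'normal'.", ""]
    for text, label in EXAMPLES:
        lines.append(f"Transaction: {text}\nLabel: {label}\n")
    lines.append(f"Transaction: {query}\nLabel:")
    return "\n".join(lines)

prompt = build_icl_prompt("Deposit of $50 at local branch")
```

Swapping the example set is all it takes to retarget the pipeline, which is why iteration is so much cheaper than retraining.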
Wyatt Marshall’s Post
-
Versatile Data Scientist with 5+ years in financial projects and 6+ years specializing in economic analysis. Committed to uncovering actionable intelligence for informed decision-making and business success.
26 principles of prompting LLMs
This paper laying out 26 principles for prompting LLMs is essential reading for practitioners across fields and roles, because prompt engineering is for everyone. A corresponding GitHub project is linked in the paper.
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
arxiv.org
-
Interesting read. I extend the question to my readers: Is prompt engineering going to become the dominant way that people interact with LLMs, or will it be hidden behind user interfaces acting as a proxy?
This paper laying out 26 principles for prompting LLMs is essential reading for practitioners across fields and roles, because prompt engineering is for everyone. A corresponding GitHub project is linked in the paper.
Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4
arxiv.org
-
Chief Data Officer - Chief Technology Officer - Chief Information Officer - Software Engineering - Software Development - Artificial Intelligence
Would you like to know how to adapt pretrained LLMs to specialized domains? Retrieval-Augmented Fine-Tuning (RAFT) might be the answer. Last week Tianjun Zhang and Shishir Patil published this technique, and the approach outperforms supervised fine-tuning and RAG combined. In RAFT, given a question and a set of retrieved documents, we train the model to ignore the documents that don't help answer the question, which we call distractor documents. RAFT accomplishes this by citing verbatim the right sequence from the relevant document that helps answer the question. This, coupled with RAFT's chain-of-thought-style response, helps improve the model's ability to reason. You can find the article here: https://lnkd.in/dVDwXnW3 And the code they used to generate the dataset: https://lnkd.in/dPgkStET
RAFT: Adapting Language Model to Domain Specific RAG
arxiv.org
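The data-construction step described above can be sketched in a few lines: mix the relevant ("golden") document with distractors, and make the training target cite the supporting passage verbatim before answering. The field names and citation markers below are illustrative assumptions, not the authors' exact schema:

```python
# Sketch of assembling one RAFT-style training record (field names and
# the ##Reason/##Answer markers are hypothetical, not the paper's schema).
import random

def make_raft_example(question, golden_doc, distractors, answer_span,
                      p_golden=0.8, seed=None):
    """Build one training record for retrieval-augmented fine-tuning."""
    rng = random.Random(seed)
    docs = list(distractors)
    if rng.random() < p_golden:  # some records omit the golden doc entirely,
        docs.append(golden_doc)  # pushing the model to rely on reasoning
    rng.shuffle(docs)
    target = (f'##Reason: the context states "{answer_span}" '
              f'##Answer: {answer_span}')
    return {"question": question, "context": docs, "target": target}

ex = make_raft_example(
    "Who proposed RAFT?",
    "RAFT was proposed by Tianjun Zhang and Shishir Patil.",
    ["RAG retrieves documents at inference time.",
     "LoRA adapts a small number of weights."],
    "Tianjun Zhang and Shishir Patil",
    seed=0,
)
```

Records like `ex` are then used for ordinary supervised fine-tuning, so the model learns to quote the relevant document and ignore the distractors.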
-
Writing a stupid paper that gives fancy nomenclature to a process everyone already practices is what finance folks used to do. Sad that science is coming down to that as well. Models perform better on the prompts they are trained against, so using RAG to get prompts and then training the model on them is 101 stuff that at best merits a blog or Reddit post. A paper with a new name for an old process is nothing but tech pimping.
Interesting paper by Shishir Patil proposing fine-tuning models for better performance with RAG. TL;DR:
1) Fine-tuning is like a "closed book" exam: you have to memorize everything.
2) RAG is like an "open book" exam: you technically don't have to memorize, but you must reason on the spot.
3) The paper proposes RAFT, which fine-tunes the LLM with the context documents so it can reason better with RAG.
This reminds me of the open-book exams at IIT, in which students on both sides of the bell curve did well. The bright students studied the books, while the jugadoos hoarded library books, literally pattern-matched questions against key words in the appendices, and still did well. https://lnkd.in/gE2xG-xy https://lnkd.in/grabPrXe
RAFT: A new way to teach LLMs to be better at RAG
techcommunity.microsoft.com
-
🚀 Check out my latest YouTube video on Text Embeddings! Whether you're a #DataScience enthusiast or a #MachineLearning practitioner, this tutorial is perfect for anyone keen on exploring #NaturalLanguageProcessing through Google Cloud's Vertex AI. In this video, you'll learn the basics of text embeddings and their crucial role in NLP projects. I'll also show you how to generate embeddings with the "textembedding-gecko@001" model on Vertex AI, visualize text embeddings in 2D using Principal Component Analysis (PCA), and compare vectors with cosine similarity. But that's just the start of our journey through an end-to-end real-world application! In upcoming videos, I'll be breaking down fundamental concepts as we extract business reviews directly from Google Maps. Our next steps involve cleaning and analyzing these reviews and creating a question-answering system using vector databases and retrieval-augmented generation techniques. Stay tuned for more videos that tackle real-world problems and datasets, offering hands-on experience and insights. Watch the full video here: https://lnkd.in/eWn7hS9Z Thanks to the #deeplearningai platform for the inspiring videos. Inspiration: https://lnkd.in/ewhZUjQ3 #VertexAI #GoogleCloudPlatform #TextEmbeddings #CloudComputing #Tutorial #pca #nlp
Understanding Text Embeddings with PCA for Beginners
https://www.youtube.com/
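The two operations the video covers can be illustrated in a few lines of NumPy. Here the vectors are random stand-ins for real Vertex AI embeddings (gecko@001 returns 768-dimensional vectors), and PCA is done directly via SVD:

```python
# Toy illustration: project embedding vectors to 2D with PCA and compare
# two of them with cosine similarity. Random vectors stand in for real
# "textembedding-gecko@001" outputs (768 dimensions).
import numpy as np

rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10, 768))  # 10 "texts", 768 dims each

# PCA via SVD: center the data, then project onto the top-2 right
# singular vectors (the directions of largest variance).
centered = embeddings - embeddings.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
coords_2d = centered @ vt[:2].T  # shape (10, 2), ready for a scatter plot

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim = cosine_similarity(embeddings[0], embeddings[1])
```

With real embeddings, semantically similar texts land near each other both in the 2D PCA plot and under cosine similarity.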
-
Adaptive-RAG: Learning to Adapt Retrieval-Augmented LLMs through Question Complexity
🤔 Problem:
📌 In a real-world scenario, not all questions posed to an LLM have the same complexity.
📌 A static RAG technique does not generalize well across questions. For example, a multi-step RAG pipeline is unnecessary overhead for a simple question.
💡 Solution:
📌 The authors of this paper from the Korea Advanced Institute of Science and Technology train a classifier that assesses the complexity of a question.
📌 Based on the identified complexity, the question is answered either with no retrieval (directly by the LLM), with single-step retrieval, or with a multi-step retrieval-based RAG solution.
📌 The approach thus dynamically selects the RAG strategy a question requires based on its complexity.
📊 Results:
📌 The approach outperformed the no-retrieval and single-step retrieval approaches on all datasets considered.
📌 It was also more efficient than the multi-step retrieval approach in time and number of steps needed to generate a response, without compromising much on effectiveness as measured by F1 score, accuracy, and exact match (EM).
In this blog, I summarize my understanding and key observations about this paper.
Blog: https://lnkd.in/gX_YR5NV
Paper: https://lnkd.in/gs8nFd8D
Annotated Paper: https://lnkd.in/gyZkWhkC
#RAG #Adaptive-RAG #LLM
E21 : Adaptive RAG
medium.com
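The routing idea reduces to a small dispatch step. In the paper the router is a trained classifier; the keyword heuristic below is a hypothetical stub standing in for it, and the three handlers are placeholders:

```python
# Sketch of Adaptive-RAG routing: a complexity classifier (here a trivial
# heuristic stub, NOT the paper's trained model) picks one of three
# strategies: no retrieval, single-step RAG, or multi-step RAG.

def classify_complexity(question: str) -> str:
    """Stub for the trained classifier; counts rough multi-hop cues."""
    hops = question.count(" and ") + question.count(" of the ")
    if len(question.split()) < 6:
        return "simple"
    return "multi" if hops >= 2 else "single"

def answer(question: str) -> str:
    """Dispatch to the cheapest strategy the question seems to need."""
    route = classify_complexity(question)
    if route == "simple":
        return f"[LLM only] {question}"          # no retrieval
    if route == "single":
        return f"[single-step RAG] {question}"   # one retrieval pass
    return f"[multi-step RAG] {question}"        # iterative retrieval

print(answer("What is RAG?"))
```

The payoff is exactly the efficiency result above: simple questions skip retrieval entirely, and only genuinely multi-hop questions pay for the iterative pipeline.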
-
It takes both storytelling talent and deep subject understanding to write a useful professional paper in this beautiful metaphorical style ('If "prompt engineering" doesn't meet the criteria of spell-casting, it's hard to say what does.') ;) Bravo, Prof. Dr. Ivan Yamshchikov! And yes, it's highly recommended weekend reading for anyone ready to start tuning LLMs for specific business tasks instead of using general-purpose solutions. https://lnkd.in/e2YcZ4PG
Brewing a Domain-Specific LLM Potion - KDnuggets
kdnuggets.com
-
Text-guided image editing is a crucial tool in both personal and professional settings, but current methods often require manual adjustments and can be noisy. #imageediting #MagicBrush To address this issue, a new dataset called MagicBrush has been introduced, containing over 10K manually annotated triples for instruction-guided real image editing. This dataset supports the training of large-scale text-guided image editing models. By fine-tuning InstructPix2Pix on MagicBrush, the new model is able to produce significantly improved images based on human evaluation. Extensive experiments demonstrate the challenging nature of the dataset and the gap between current baselines and real-world editing needs. 📷🎨 Title: MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing Code: https://lnkd.in/eZMt_HYs Graph: https://lnkd.in/e_M56TPq Paper: https://lnkd.in/eiy_Fqby ⭐️: 211 Check out other GitHub repos on the NLP Index: https://lnkd.in/geg6zPm
GitHub - OSU-NLP-Group/MagicBrush: Dataset, code and models for the paper "MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing".
github.com
-
Super interesting talk by Stef Heyenrath about extracting text from a PDF, vectorizing it, and then answering questions based on those text fragments with a specially designed prompt. https://lnkd.in/eDHX8Gz9 mstack .net Assemble, thanks for the great day!
Implementing a question-answering system for PDF documents using ChatGPT and Redis - mstack
https://mstack.nl
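The pipeline from the talk (chunk the extracted text, embed the chunks, retrieve the closest ones, compose a prompt) can be sketched end to end. The bag-of-words `embed` below is a toy stand-in so the sketch runs without an embedding model, and the chunk texts are invented:

```python
# Minimal retrieval-augmented QA sketch over pre-extracted PDF text.
# embed() is a toy bag-of-words stand-in for a real embedding model;
# the chunks and question are hypothetical examples.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy 'embedding': lowercase bag of alphanumeric words."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return Counter(cleaned.split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def answer_prompt(question: str, chunks: list[str], k: int = 2) -> str:
    """Retrieve the k most similar chunks and build the QA prompt."""
    q = embed(question)
    top = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
    context = "\n".join(top)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

chunks = [
    "Invoices are due within 30 days.",
    "The warranty covers two years.",
    "Shipping is free over 50 euros.",
]
prompt = answer_prompt("How long is the warranty?", chunks)
```

In a real setup the `Counter` vectors would be replaced by model embeddings stored in a vector database (the talk uses Redis), but the retrieve-then-prompt flow is the same.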