Alejandro Luis Figueroa Cevallos’ Post


Amazon MRA | Full Stack Developer | Electronic Engineer

It is certainly so, Andrew Ng: LLMs trained on their own responses through agentic workflows could potentially improve output quality, resembling human learning processes. This approach, though costly, could enhance LLM training. #AI #GenAI #innovation #technology

Andrew Ng

Founder of DeepLearning.AI; Managing General Partner of AI Fund; Founder and CEO of Landing AI

Inexpensive token generation and agentic workflows for LLMs open up new possibilities for training LLMs on synthetic data. Pretraining an LLM on its own directly generated responses to prompts doesn't help. But if an agentic workflow implemented with the LLM results in higher-quality output than the LLM can generate directly, then training on that output becomes potentially useful.

Just as humans can learn from their own thinking, perhaps LLMs can, too. Imagine a math student learning to write mathematical proofs. By solving a few problems, even without external input, they can reflect on what works and learn to generate better proofs.

LLM training involves (i) pretraining (learning from unlabeled text data to predict the next word), followed by (ii) instruction fine-tuning (learning to follow instructions) and (iii) RLHF/DPO to align the model to human values. Step (i) requires orders of magnitude more data than the others. For example, Llama 3 was pretrained on over 15 trillion tokens. LLM developers are still hungry for more data. Where can we get more text to train on?

Many developers train smaller models on the output of larger models, so a smaller model learns to mimic a larger model's behavior on a particular task. But an LLM can't learn much by training on data it generated directly. Indeed, training a model repeatedly on the output of an earlier version of itself can result in model collapse. However, an LLM wrapped in an agentic workflow can produce higher-quality output than it can generate directly. This output might be useful as pretraining data.

Efforts like these have precedents:
- When using reinforcement learning to play a game like chess, a model might learn a function that evaluates board positions. If we apply game tree search along with a low-accuracy evaluation function, the model can come up with more accurate evaluations. Then we can train that evaluation function to mimic these more accurate values.
- During alignment, Anthropic's constitutional AI uses RLAIF (RL from AI Feedback) to judge LLM output quality, substituting feedback generated by an AI model for human feedback.

A significant barrier to using agentic workflows to produce LLM training data is the cost of generating tokens. Say we want to generate 1 trillion tokens to extend a pre-existing dataset. At current retail prices, 1 trillion tokens from GPT-4-turbo ($30 per million output tokens), Claude 3 Opus ($75), Gemini 1.5 Pro ($21), and Llama-3-70B on Groq ($0.79) would cost, respectively, $30M, $75M, $21M, and $790K. Of course, an agentic workflow would require generating more than one token per final output token. But budgets for training cutting-edge LLMs easily surpass $100M, so spending a few million dollars more on data to boost performance is feasible. That's why agentic workflows might open up new opportunities for high-quality synthetic data generation.

[Original text: https://lnkd.in/gFF2AsZ9 ]
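To make the agentic-workflow idea concrete, here is a minimal Python sketch of a reflection-style loop: the model drafts an answer, critiques it, revises it, and only the final revision is kept as candidate synthetic training data. The `call_llm` helper, the prompt wording, and the two-round default are illustrative assumptions, not anything specified in the post.

```python
# Minimal sketch of the idea, not a production pipeline.
# `call_llm` is a hypothetical stand-in for any chat-completion API.

def call_llm(prompt: str) -> str:
    """Placeholder: send `prompt` to an LLM and return its text response."""
    raise NotImplementedError("Wire this to your model provider of choice.")

def agentic_answer(question: str, n_rounds: int = 2) -> str:
    """Draft an answer, then iteratively critique and revise it.

    The final revision is typically better than a single direct generation,
    which is what makes it a candidate for training data.
    """
    answer = call_llm(f"Answer the question:\n{question}")
    for _ in range(n_rounds):
        critique = call_llm(
            f"Question:\n{question}\n\nDraft answer:\n{answer}\n\n"
            "List concrete errors or weaknesses in the draft."
        )
        answer = call_llm(
            f"Question:\n{question}\n\nDraft answer:\n{answer}\n\n"
            f"Critique:\n{critique}\n\nRewrite the answer, fixing the issues."
        )
    return answer

def build_synthetic_corpus(prompts: list[str]) -> list[str]:
    """Run the agentic loop over many prompts and keep only the final outputs,
    which are the candidate synthetic pretraining or fine-tuning examples."""
    return [agentic_answer(p) for p in prompts]
```

Only the final revisions are collected because, per the argument above, the model's direct single-pass responses add little as training data; the value comes from the extra quality the workflow buys.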
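The chess precedent can be sketched the same way: a low-accuracy evaluation function wrapped in tree search yields backed-up values that are more accurate than the raw evaluations, and those values become regression targets for training the evaluation function itself. The `position` interface (`is_terminal`, `children`, `side_to_move_is_max`) and function names below are assumptions made for illustration.

```python
# Hedged sketch of the bootstrapping precedent: search + a weak evaluator
# produces better value estimates, which then serve as training targets.

def search_improved_value(position, evaluate, depth: int = 3) -> float:
    """Minimax search to `depth`, using the current (weak) `evaluate` at the leaves.

    The backed-up value is usually more accurate than evaluate(position) alone.
    """
    if depth == 0 or position.is_terminal():
        return evaluate(position)
    child_values = [
        search_improved_value(child, evaluate, depth - 1)
        for child in position.children()
    ]
    return max(child_values) if position.side_to_move_is_max() else min(child_values)

def make_training_set(positions, evaluate):
    """Pair each position with its search-improved value as a regression target."""
    return [(p, search_improved_value(p, evaluate)) for p in positions]
```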
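The cost figures in the post follow from straightforward per-million-token arithmetic; this short snippet reproduces them from the retail prices quoted there.

```python
# Back-of-the-envelope cost of generating 1 trillion output tokens,
# using the USD-per-million-token prices quoted in the post.
PRICE_PER_MILLION_TOKENS = {
    "GPT-4-turbo": 30.00,
    "Claude 3 Opus": 75.00,
    "Gemini 1.5 Pro": 21.00,
    "Llama-3-70B on Groq": 0.79,
}

TOKENS = 1_000_000_000_000  # 1 trillion tokens

for model, price in PRICE_PER_MILLION_TOKENS.items():
    cost = (TOKENS / 1_000_000) * price
    print(f"{model}: ${cost:,.0f}")
# GPT-4-turbo: $30,000,000
# Claude 3 Opus: $75,000,000
# Gemini 1.5 Pro: $21,000,000
# Llama-3-70B on Groq: $790,000
```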

Apple's Tiny LLMs, Amazon Rethinks Cashier-Free Stores, and more

deeplearning.ai
