Ludwig Schmidt
United States
373 followers
288 connections
Experience & Education
-
Anthropic
****** ** ********* *****
-
**** *. ***** ****** ** ******** ******* & ***********
********* ********* (** *****)
-
******** **********
****** ********* *********
-
************* ********* ** **********
****** ** ********** (***) ******** *******
-
-
************* ********* ** **********
****** ** ******* (**) ******** *******
-
Other similar profiles
-
Mandeep Waraich
San Francisco Bay Area
-
Christian Rupprecht
Associate Professor
Greater Oxford Area
-
Hannaneh Hajishirzi
Greater Seattle Area
-
Agneet Chatterjee
CS PhD @ Arizona State University | Vision and Language
Tempe, AZ
-
Nicole Belanger
The future is bright...and Yellow!
San Francisco, CA
-
Vasudev Lal
Portland, Oregon Metropolitan Area
-
Stefan Roth
Frankfurt Rhine-Main Metropolitan Area
-
Marie-Claude (MC) Lavoie
Montreal, QC
-
Xiaolong Wang
San Diego, CA
-
Diego Martí Monsó
R&D @ Yellow (we're hiring!) | TUM | CDTM
Munich
-
Cameron Tuckerman-Lee
San Francisco Bay Area
-
Guanya Shi
Assistant Professor at the Robotics Institute at Carnegie Mellon University. Lead the LeCAR (Learning and Control for Agile Robotics) Lab.
Pittsburgh, PA
-
Been Kim
Seattle, WA
-
Zhe Gan
Seattle, WA
-
Ali Yekkehkhany
Postdoctoral Researcher at University of California, Berkeley
United States
-
Arun Mannodi Kanakkithodi
Assistant Professor, School of Materials Engineering, Purdue University
West Lafayette, IN
-
Beidi Chen
Berkeley, CA
-
Gözde Barim
San Francisco Bay Area
-
Mohammad Javad Amiri
Assistant Professor at Stony Brook University
Stony Brook, NY
-
Tara Parhizkar, Ph.D., P.E.
Los Angeles Metropolitan Area
Explore more posts
-
Dr. Aditya Raj
A recent paper titled "Matrix Multiplication-Free LLMs" demonstrates a significant advance for Large Language Models (LLMs) by cutting computational costs. The authors eliminate MatMul operations from LLMs, claiming a 10× reduction in memory usage and a 25.6% increase in training speed, all while maintaining strong performance at billion-parameter scales. Paper link: https://lnkd.in/ggph8qXc (A sketch of the core idea follows below.) #AI #machinelearning #deeplearning #LLMs
16
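As I understand this line of MatMul-free work, the key enabler is constraining weights to {-1, 0, +1}, so a dense multiply collapses into additions and subtractions. A minimal NumPy sketch of that idea (my illustration, not the paper's code; the function names and per-tensor scaling are assumptions):

```python
# Illustrative "MatMul-free" linear layer: with ternary weights, each output
# element is just a sum/difference of inputs. The masked matmuls below stand in
# for hardware-level adds; they make the arithmetic explicit, not fast.
import numpy as np

def ternary_quantize(w):
    """Round weights to {-1, 0, +1} with a per-tensor scale."""
    scale = np.mean(np.abs(w)) + 1e-8
    return np.clip(np.round(w / scale), -1, 1), scale

def ternary_linear(x, w_ternary, scale):
    """Linear map with ternary weights: adds and subtracts input columns."""
    add_mask = (w_ternary == 1).astype(x.dtype)
    sub_mask = (w_ternary == -1).astype(x.dtype)
    return scale * (x @ add_mask - x @ sub_mask)

rng = np.random.default_rng(0)
w, s = ternary_quantize(rng.normal(size=(16, 8)))
y = ternary_linear(rng.normal(size=(4, 16)), w, s)
print(y.shape)  # (4, 8)
```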
-
Bing Dong
We are thrilled to share our latest paper, "Modularized Neural Network Incorporating Physical Priors for Future Building Energy Modeling," published in Patterns (Cell Press). Read the full text here: https://lnkd.in/eKyzK8Hr. Kudos to my great Ph.D. student ZiXin Jiang. The proposed model is designed for load prediction, dynamic modeling, building retrofitting, and energy optimization. Here are the four key innovations:
1) Heat-balance-inspired modularization: we incorporate physical knowledge by modularizing the model structure into a heat-balance framework. Specifically, we develop distinct neural network modules to estimate each heat-transfer term of the dynamic building system.
2) State-space-inspired encoder-decoder structure: an encoder extracts historical information, a current cell measures data from the current time step, and a decoder predicts system responses based on future system inputs and disturbances.
3) Physically consistent model constraints: we introduce physical consistency constraints to ensure the model responds appropriately to given inputs. For example, the conduction heat flux through a wall decreases as the R-value increases, and indoor air temperature decreases as the HVAC cooling load increases.
4) Lego-brick-inspired modular design: we connect different modules based on physical topology, allowing multiple-building applications through model sharing and inheritance.
The proposed model has been validated on three real-world datasets and two synthetic datasets. We believe it provides a scalable solution for future multi-scale, multi-component, and multi-task building energy modeling. Code available: https://lnkd.in/ewVrq7rb. Thanks to #NSF for the funding support; hopefully this work sheds light on future AI/data-driven building energy modeling. (A toy code sketch of the modular structure follows below.) Amir Roth, Tianzhen Hong, SyracuseCoE: NYS's Center of Excellence in Environmental and Energy Systems, #Syracuse
97
6 Comments
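A toy PyTorch sketch of the heat-balance modularization idea (my illustration under stated assumptions, not the paper's code; the module inputs and the simple sum-of-terms balance are mine):

```python
# Toy modular, physics-inspired building model: separate sub-networks estimate
# individual heat-transfer terms, and their balance drives the temperature change.
import torch
import torch.nn as nn

class HeatTerm(nn.Module):
    """One modular sub-network estimating a single heat-transfer term."""
    def __init__(self, n_inputs):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_inputs, 16), nn.ReLU(), nn.Linear(16, 1))

    def forward(self, x):
        return self.net(x)

class ModularBuildingModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.conduction = HeatTerm(2)  # inputs: (T_out - T_in), wall R-value
        self.solar = HeatTerm(1)       # input: solar irradiance
        self.hvac = HeatTerm(1)        # input: HVAC cooling load

    def forward(self, dT, r_value, solar, hvac_load, capacitance=1.0):
        q = (self.conduction(torch.cat([dT, r_value], dim=-1))
             + self.solar(solar)
             - self.hvac(hvac_load))   # cooling removes heat from the balance
        return q / capacitance         # approximate dT_in/dt from the heat balance

model = ModularBuildingModel()
batch = {k: torch.randn(4, 1) for k in ["dT", "r_value", "solar", "hvac_load"]}
print(model(**batch).shape)  # torch.Size([4, 1])
```

Swapping or adding modules (e.g., infiltration, internal gains) mirrors the "Lego brick" composition the authors describe.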
-
Flaviu Cipcigan
I'll be at ICML in Vienna Monday-Sunday next week. Excited to meet other researchers (DM me), present a poster with Dominic Phillips at AI for Science, and host a packed social night between The AI Alliance and AI for Science. Our paper is titled MetaGFN: Exploring Distant Modes with Adapted Metadynamics for Continuous GFlowNets. We adapt metadynamics for arbitrary black-box reward functions on continuous domains. MetaGFN uses adapted metadynamics for off-policy exploration in continuous GFlowNets. For a 1D potential with distant modes and the alanine dipeptide, MetaGFN has faster convergence to the target distribution and discovers more distant reward modes than previous GFlowNet exploration strategies.
32
3 Comments
-
Edward Y. Chang
New GAI Textbook Announcement (The textbook is being used by Stanford CS372: AI for precision medicine and psychological disorders) The past two years have been a significant period in the field of AI, during which we have observed substantial strides in generative AI technologies. Having dedicated over 5,000 hours to research and analysis, I have compiled this book to share a collection of findings and hypotheses regarding Large Language Models (LLMs)—detailing their operational mechanisms, and strategies to mitigate biases and reduce instances of hallucination. The central premise is to enhance context awareness and define human objectives with greater precision. By doing so, LLMs can more effectively mirror human linguistic behaviors, utilizing context-specific linguistic features to fulfill intended goals. SocraSynth: Socratic Synthesis with Multiple Large Language Models. https://lnkd.in/giC4474p
25
-
Mitodru Niyogi
I’m happy to share my recent work on mathematical LLMs in collaboration with Prof. Arnab Bhattacharya. We found that we don’t need giant LLMs to be strong at mathematical reasoning. Our 208M-parameter model outperformed LLaMA-1 33B, LLaMA-2 13B, Vicuna 13B, PaLM 62B, Falcon 40B, Minerva 8B, and LLEMMA 7B on the GSM8K benchmark despite being orders of magnitude smaller! Most interestingly, we trained for only 146 A100 hours, whereas the math-specialized LLEMMA 7B (ICLR’24) was trained for 23,000 A100 hours! Trained in 🇮🇳 not large but smart 😎 “Large doesn’t mean strong or smart” #genai #math #llms #ai #reasoning #india #artificialintelligence #languagemodels
27
4 Comments
-
Hai Huang
Stole a few points from Prof. Christopher Manning, who talked about LLMs and language modeling in general on the latest TWIML AI Podcast:
📌 Humans acquire language very differently from LLMs: we need millions of words, compared to billions or even trillions of tokens for LLMs. LLM researchers may want to investigate and learn from how humans acquire language.
📌 LLMs cannot reason. However, other deep learning systems, such as AlphaGo, can; LLM researchers may want to look into integrating that type of reasoning/searching/planning capability into LLMs.
📌 LLMs' world models should enable search and discovery. Although Prof. Manning didn’t call this out explicitly, my understanding is that this points to something like a knowledge-graph structure.
📌 Next-gen LLM idea: a soft form of locality and hierarchy. Transformers attend every token to every other token, which is very inefficient, while human language can be modeled by n-grams most of the time.
#artificialintelligence #machinelearning #deeplearning https://lnkd.in/euzwMQ6p
84
18 Comments
-
Hossein Rahmani
4 papers accepted for presentation at ECCV 2024, to be held in Milan, Italy, from September 29 to October 4, 2024. Thanks to postdocs, PhD students and our collaborators!
1- Xinyu Yang, Hossein Rahmani, Sue Black, Bryan Williams, "Weakly Supervised Co-training with Swapping Assignments for Semantic Segmentation". Code and paper are available at https://lnkd.in/eCXrAsDr. The proposed method, CoSA, is the first single-stage weakly supervised semantic segmentation approach to outperform all existing multi-stage methods, including those with additional supervision, surpassing existing baselines by a substantial margin.
2- Feixiang Zhou, Bryan Williams, Hossein Rahmani, "Towards Adaptive Pseudo-label Learning for Semi-Supervised Temporal Action Localization". Code and paper will be available soon.
3- Zhengbo Zhang, Li Xu, Duo Peng, Hossein Rahmani, Jun Liu, "Diff-Tracker: Text-to-Image Diffusion Models are Unsupervised Trackers". Code and paper will be available soon.
4- Xiaofei Hui, Qian Wu, Hossein Rahmani, Jun Liu, "Class-Agnostic Object Counting with Text-to-Image Diffusion Model". Code and paper will be available soon.
#ECCV2024
128
2 Comments
-
Hazem Abdelazim
This is very interesting and useful research. Scientists and practitioners working on Arabic RAG pipelines can make good use of this work. In my experience, semantic embeddings in many cases fail to capture some important fields; for example, they may miss المادة الخامسة واربعون ("Article Forty-Five") within a legal Arabic context, as it gets dissolved in the embedding vector. Augmenting the RAG context with metadata can greatly enhance performance. This technology (GliNER) provides a practical alternative to traditional NER models, which are limited to predefined entities, and to Large Language Models (LLMs), which, despite their flexibility, are costly and large for resource-constrained scenarios. I tested it on Arabic and it works pretty well 👇 (A minimal usage sketch follows below.)
17
1 Comment
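A minimal usage sketch of GliNER as described above (the checkpoint name and labels are my assumptions for illustration; see the GliNER repo for current models):

```python
# Zero-shot NER with GliNER: you pass arbitrary label names at inference time,
# so domain-specific fields (e.g. legal article references) need no retraining.
from gliner import GLiNER  # pip install gliner

model = GLiNER.from_pretrained("urchade/gliner_multi")  # multilingual checkpoint (assumed)

text = "وفقا للعقد، تنطبق المادة الخامسة واربعون على المستأجر"  # "Per the contract, Article Forty-Five applies to the tenant"
labels = ["legal article", "person", "contract party"]   # user-defined, zero-shot labels

for entity in model.predict_entities(text, labels, threshold=0.4):
    print(entity["text"], "->", entity["label"], round(entity["score"], 2))
```

Extracted entities can then be attached as metadata alongside the embeddings, keeping fields like article numbers retrievable.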
-
Flaviu Cipcigan
Just read Farquhar et al.'s Nature paper on detecting hallucinations in large language models using semantic entropy. Interesting result on quantifying uncertainty in LLMs. When you ask an LLM a question, its answers may look correct but be made up - a confabulation. For example, the LLM may answer incorrectly that the STARD10 protein is a "negative regulator of the mTOR pathway" instead of its true role as a lipid transfer protein. When an answer is confabulated, multiple subsequent generations will have different meanings. For example, a second answer from the LLM may incorrectly say that STARD10 is a "negative regulator of the meiotic recombination process". The variability in meaning can be quantified using semantic entropy, which measures variability over multiple generated answers: higher entropy suggests greater variability and potential confabulation. The LLM can then refuse questions with high semantic entropy. To calculate semantic entropy, you generate 5-10 answers and cluster them by semantic meaning using bidirectional entailment. Two answers are part of the same cluster if they entail - or imply - each other (e.g. "Joe buys a red apple" implies "A red apple was bought by Joe" and vice versa). The entailment is calculated using another, possibly less powerful LLM, so it's an interesting application of using language models to correct other language models. After clustering, the probabilities of each cluster are calculated from the output probabilities of the LLM, and then the resulting entropy. Thresholding the entropy can classify a question as likely to produce confabulations with an AUROC of around 0.75. Link to paper below. (A compact code sketch follows below.)
63
2 Comments
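A compact sketch of the procedure described above (my paraphrase, not the paper's code). `entails` stands in for the entailment model, and sample frequencies stand in for the paper's use of the LLM's output probabilities:

```python
# Semantic entropy: cluster sampled answers by bidirectional entailment, then
# compute the entropy of the cluster distribution. High entropy -> many distinct
# meanings -> likely confabulation.
import math

def semantic_entropy(answers, entails):
    """answers: list of generated strings; entails(a, b) -> bool."""
    clusters = []  # each cluster holds mutually-entailing answers
    for ans in answers:
        for cluster in clusters:
            rep = cluster[0]
            if entails(ans, rep) and entails(rep, ans):  # bidirectional entailment
                cluster.append(ans)
                break
        else:
            clusters.append([ans])
    probs = [len(c) / len(answers) for c in clusters]  # frequency estimate
    return -sum(p * math.log(p) for p in probs)

# Toy usage, with exact string match standing in for the entailment model:
answers = ["a lipid transfer protein"] * 7 + ["an mTOR regulator"] * 3
print(round(semantic_entropy(answers, lambda a, b: a == b), 3))  # ~0.611
```

Thresholding this value (the paper reports an AUROC of around 0.75) decides whether to answer or refuse.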
-
Ajay S.
In computational linguistics, researchers are focused on enhancing language models' ability to extract precise information from extensive textual data. Recent studies introduce "retrieval heads": specialized attention heads that selectively focus on critical segments of long texts, improving targeted retrieval. These heads, identified in transformer-based models like LLaMA and Yi, significantly enhance accuracy and efficiency, particularly in long-context scenarios. The methodology involves thorough experiments across various models, including the Needle-in-a-Haystack test, which embeds specific information in a long context to measure retrieval precision. Models with active retrieval heads outperform those without, and accuracy drops notably when these heads are masked (a toy sketch of such masking follows below). The empirical evidence underscores the effectiveness of retrieval heads in improving retrieval precision and reliability, deepening our understanding of attention mechanisms in large-scale text processing. https://lnkd.in/dNNPvMXw https://lnkd.in/ddbHX6EA Innovation Hacks AI #aiinnovation #generativeai
1
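A self-contained toy of the masking experiment described above (illustrative, not the cited papers' code): zero out chosen attention heads and observe that the output changes; in the papers, ablating retrieval heads is what degrades needle-retrieval accuracy:

```python
# Toy multi-head attention with head ablation: selected heads' outputs are zeroed.
import torch

def masked_head_attention(q, k, v, n_heads, masked_heads=()):
    B, T, D = q.shape
    hd = D // n_heads
    def split(x):  # (B, T, D) -> (B, n_heads, T, hd)
        return x.view(B, T, n_heads, hd).transpose(1, 2)
    qh, kh, vh = split(q), split(k), split(v)
    att = torch.softmax(qh @ kh.transpose(-2, -1) / hd**0.5, dim=-1)
    out = att @ vh                       # (B, n_heads, T, hd)
    for h in masked_heads:
        out[:, h] = 0.0                  # ablate a candidate "retrieval head"
    return out.transpose(1, 2).reshape(B, T, D)

x = torch.randn(2, 5, 64)
full = masked_head_attention(x, x, x, n_heads=8)
ablated = masked_head_attention(x, x, x, n_heads=8, masked_heads=(0, 3))
print((full - ablated).abs().max().item() > 0)  # True: ablation changes outputs
```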
-
Gregory Mermoud
Very insightful work by Anthropic’s interpretability team, and an amazing paper, with outstanding writing and figures. The idea is very simple: interpret LLMs by leveraging sparse autoencoders as surrogate models of the MLPs in transformer blocks, which allows one to disambiguate the superposition of features captured by a single neuron. A simple idea, but a very careful and complex execution, as is often the case in our line of work. The paper goes into many details and provides a large array of insights, although the gist of the implementation remains obfuscated due to the closed-source nature of Claude. Too bad, because this is the kind of work that we need to better understand and eventually trust LLMs. This is demonstrated by the authors in the section ‘Influence on Behavior’, where they show that clamping some features to either a high or low value during inference is “remarkably effective at modifying model outputs in specific, interpretable ways”. Hopefully this kind of work will be replicated and generalized to open-weights models, so that we have new ways to steer their behavior. (A minimal sketch of the technique follows below.) https://lnkd.in/eVym7f_f #interpretability #xai #explainableai #steerableai #anthropic #claude
3
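For intuition, a minimal sparse-autoencoder sketch (a generic illustration of the technique, not Anthropic's implementation; dimensions and coefficients are arbitrary): reconstruct recorded MLP activations through an overcomplete ReLU layer with an L1 penalty, then "clamp" a latent feature and decode to steer:

```python
# Sparse autoencoder as a surrogate for MLP activations: the overcomplete latent
# layer disentangles superposed features; clamping a latent steers the decode.
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    def __init__(self, d_act, d_latent):
        super().__init__()
        self.enc = nn.Linear(d_act, d_latent)
        self.dec = nn.Linear(d_latent, d_act)

    def forward(self, acts):
        z = torch.relu(self.enc(acts))  # non-negative, encouraged to be sparse
        return self.dec(z), z

d_act, d_latent = 512, 4096             # overcomplete feature dictionary
sae = SparseAutoencoder(d_act, d_latent)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)

acts = torch.randn(64, d_act)           # stand-in for recorded MLP activations
recon, z = sae(acts)
loss = ((recon - acts) ** 2).mean() + 1e-3 * z.abs().mean()  # reconstruction + L1
loss.backward(); opt.step()

# "Clamping" a feature as in the post: force one latent high, then decode.
z_clamped = z.detach().clone()
z_clamped[:, 123] = 10.0
steered = sae.dec(z_clamped)
```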
-
Kaize Ding
Even though LLMs are commonly considered powerful data augmentation tools, naively prompting an LLM with arbitrary augmentation instructions does not achieve the improvements you might expect. In this work, we explore how to empower an LLM to automatically generate and select suitable data augmentation methods for specific downstream tasks, and we show that our Self-LLMDA generalizes well across different tasks. (A schematic sketch follows below.)
58
2 Comments
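A schematic of the generate-then-select loop in the spirit of the post (my paraphrase, not the authors' code; `llm` and `train_and_eval` are hypothetical stand-ins for an LLM client and a downstream train/evaluate routine):

```python
# Generate candidate augmentation instructions with an LLM, apply each to the
# seed data, and keep the instruction whose augmented data scores best downstream.
def self_llm_augment(llm, seed_examples, train_and_eval, n_candidates=8):
    prompt = ("Propose one concise data augmentation instruction for this task.\n"
              "Examples:\n" + "\n".join(seed_examples))
    candidates = [llm(prompt) for _ in range(n_candidates)]

    scored = []
    for instruction in candidates:
        augmented = [llm(f"{instruction}\nInput: {ex}") for ex in seed_examples]
        scored.append((train_and_eval(seed_examples + augmented), instruction))

    best_score, best_instruction = max(scored)  # task-aware selection
    return best_instruction, best_score
```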
-
Filip Kilibarda
I am excited to share an update on a small project of mine! https://lnkd.in/e28pz7t7 I have been working on a flexible Variational AutoEncoder (VAE) that is simple to use and can be easily reconfigured for various use cases, including feature reduction, probability capture, and generative models such as diffusion probabilistic models. The VAE also offers quality-of-life features such as non-balanced multi-GPU support and batch estimators. And there's more to come! Stay tuned for some helpful how-to guides and Jupyter notebooks in the pipeline. (A minimal VAE sketch follows below.) #AI #DeepLearning #VAE #GenerativeModels #NeuralNetworks
8
2 Comments
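For readers new to VAEs, a minimal generic sketch (not the linked project's code) showing the reparameterization trick and the reconstruction-plus-KL loss:

```python
# Minimal VAE: the encoder outputs a mean and log-variance, a latent is sampled
# via the reparameterization trick, and the decoder reconstructs the input.
import torch
import torch.nn as nn

class VAE(nn.Module):
    def __init__(self, d_in=784, d_latent=16):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(d_in, 128), nn.ReLU(), nn.Linear(128, 2 * d_latent))
        self.dec = nn.Sequential(nn.Linear(d_latent, 128), nn.ReLU(), nn.Linear(128, d_in))

    def forward(self, x):
        mu, logvar = self.enc(x).chunk(2, dim=-1)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar):
    rec = ((recon - x) ** 2).sum(dim=-1).mean()                           # reconstruction
    kl = -0.5 * (1 + logvar - mu ** 2 - logvar.exp()).sum(dim=-1).mean()  # KL to N(0, I)
    return rec + kl

x = torch.rand(32, 784)
model = VAE()
recon, mu, logvar = model(x)
print(vae_loss(x, recon, mu, logvar).item())
```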
-
Alexandra Neagu
I recently completed a detailed review of the paper "ALBERT: A Lite BERT for Self-supervised Learning of Language Representations" by Lan et al.! This fascinating study presents an innovative approach to language modelling, emphasizing efficiency and effectiveness through a lighter, more parameter-efficient architecture compared to traditional models like BERT. In my review, I delved into the strengths and weaknesses of the paper, discussing its impact and potential areas for improvement. This work has made a significant contribution to the field of natural language processing and has already influenced many advancements in the area. You can read my full review of the paper here: https://lnkd.in/ehxm5jXt #NLP #MachineLearning #ArtificialIntelligence #AI #Research #TechReview #DeepLearning
16
1 Comment
Others named Ludwig Schmidt
-
Ludwig Schmidt
Director Corporate Governance at Vogel Communications Group GmbH & Co. KG
Würzburg
-
Ludwig Schmidt
Managing Director at Ludwig Schmidt -Transporte-
Frankfurt Rhine-Main Metropolitan Area
-
Ludwig Schmidt
Business Development Marketing & Sales
Salzburg
-
Ludwig Schmidt
Owner at SCHMIDT LAW FIRM
Lake Oswego, OR
51 others named Ludwig Schmidt are on LinkedIn