Anima Anandkumar
San Francisco Bay Area
188K followers
500+ connections
Activity
-
Curious about boosting context length in Llama 3.1 by 16x? Our Mini-sequence Transformer (MST) offers insights! MST extends context length with no…
Shared by Anima Anandkumar
-
Our workshop on efficient training of neural architectures at scale is happening now in Hall A, room A1! Join us to hear from our amazing speaker…
Liked by Anima Anandkumar
-
Very impressive if true. "Lean verifies every proof step that LLM proposes and hence, we can completely remove any hallucination and guarantee 100%…
Liked by Anima Anandkumar
Other similar profiles
- Aditya Grover, Los Angeles, CA
- Lex Fridman, Cambridge, MA
- Andrew Ng, Palo Alto, CA
- Flora Salim, Australia
- Cassie Kozyrkov, New York, NY
- Sarah Tariq, Santa Clara, CA
- Carly Taylor, M.Sc., Los Angeles Metropolitan Area
- Anushri Dixit, Los Angeles, CA
- Monica Agrawal, Cambridge, MA
- Thomas Wolf, Utrecht
- Bojan Tunguz, Ph.D., Tampa, FL
- Chip Huyen, San Francisco, CA
- Allie K. Miller, New York, NY
- Andrew Huberman, Stanford, CA
- Sebastian Raschka, PhD, Madison, WI
- Aleksa Gordić, London
- Jim Fan, Stanford, CA
- Deepti Raghavan, Palo Alto, CA
- Dr. Joy Buolamwini, Greater Boston
- Julien Chaumond, Greater Paris Metropolitan Region
Explore more posts
-
Mark Wilde
Joint work with Dhrumil Patel and Patrick Coles now published in Quantum - the open journal for quantum science: https://lnkd.in/g7vUGDjn

Popular Summary: Many real-world problems in science and industry can be expressed as optimization problems, which involve finding the best solution while meeting specific constraints. Among these, a special class of optimization problems called semidefinite programming holds significance. They are widely used to model or approximate problems arising in various fields such as operations research, combinatorial optimization, control theory, and quantum information theory. For solving these programs, quantum algorithms have been proven to provide a quadratic speedup over classical algorithms. However, these quantum algorithms are not well-suited for current quantum devices, which are noisy and limited in their capabilities.

In this work, we propose three quantum algorithms designed to run on these noisy devices. Our algorithms are hybrid quantum-classical algorithms that use a classical computer for the optimization and call a quantum computer only for tasks that the classical computer cannot solve efficiently. We rigorously analyze the performance of one of our algorithms, quantifying how rapidly it converges to the optimal value. Finally, to demonstrate their practicality, we numerically simulate our quantum algorithms on problems like MaxCut, a prominent graph-theoretic problem. Our simulations showcase their effectiveness even in the presence of noise.
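For readers unfamiliar with MaxCut, the benchmark problem mentioned in the post: the goal is to split a graph's vertices into two groups so that as many edges as possible cross between the groups. A minimal classical brute-force sketch (purely illustrative; this is not the authors' quantum algorithm, and the function name is made up):

```python
from itertools import product

def max_cut(n, edges):
    """Brute-force MaxCut: try every bipartition of n vertices and
    count the edges crossing the cut. Exponential in n, so toy sizes only."""
    best = 0
    for bits in product([0, 1], repeat=n):
        cut = sum(1 for u, v in edges if bits[u] != bits[v])
        best = max(best, cut)
    return best

# 4-cycle: alternating vertices form the optimal cut, crossing all 4 edges.
print(max_cut(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))  # → 4
```

Quantum approaches aim to beat this exponential search heuristically on large instances; the brute force above only serves to define the objective.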
-
Alok Srivastava
We are excited to share our recent publication on Machine learning to Classify Alzheimer genes association using Naïve Bayes algorithm, co-authored by Sushrutha Raj and Anchal Vishnoi. This study combines text mining and machine learning to identify and prioritize candidate genes for Alzheimer's, classifying them into three weighted association classes. The classifier was trained on a meticulously curated gold-standard dataset and validated using 10-fold cross-validation, ensuring consistency. The system uses text mining and a Bayesian algorithm to categorize PubMed abstracts and predict disease-gene associations. Achieving 87.33% accuracy and a confidence level of 90.10% ± 0.142, it extracted 2031 genes, with 1162 positive, 489 negative, and 1439 ambiguous. Notably, 915 positive genes were newly identified, enhancing understanding and prompting further research into Alzheimer's genetic factors, while the ambiguous genes need additional investigation. https://lnkd.in/gSDvGQt7
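As a rough illustration of the multinomial Naive Bayes text classification the post describes, here is a toy sketch with made-up example tokens (this is not the paper's curated dataset or feature set, just the generic technique):

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Train a multinomial Naive Bayes classifier with Laplace smoothing.
    docs: list of (token_list, label) pairs."""
    class_counts = Counter(label for _, label in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    model, default = {}, {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        model[label] = (
            math.log(class_counts[label] / len(docs)),  # log-prior
            {w: math.log((word_counts[label][w] + 1) / (total + len(vocab)))
             for w in vocab},                            # smoothed log-likelihoods
        )
        default[label] = math.log(1 / (total + len(vocab)))  # unseen-word fallback
    return model, default

def predict_nb(model, default, tokens):
    """Return the label maximizing log-prior plus summed token log-likelihoods."""
    scores = {
        label: log_prior + sum(log_like.get(w, default[label]) for w in tokens)
        for label, (log_prior, log_like) in model.items()
    }
    return max(scores, key=scores.get)

# Hypothetical toy "abstracts" (invented for illustration):
docs = [
    ("amyloid plaque gene".split(), "positive"),
    ("no association found".split(), "negative"),
    ("amyloid tau risk gene".split(), "positive"),
    ("association not significant".split(), "negative"),
]
model, default = train_nb(docs)
print(predict_nb(model, default, "amyloid gene".split()))  # → positive
```

The real system works over PubMed abstracts with a curated gold standard and 10-fold cross-validation; the sketch only shows the classification rule itself.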
-
Ebrahim Bagheri
Responsible and Reproducible Machine Learning (Festschrift for Prof. Stan Matwin) https://lnkd.in/gN_rRSY3 We are pleased to invite submissions for a special issue of Computational Intelligence on “Responsible and Reproducible Machine Learning.” In the era of rapid advancements in artificial intelligence and machine learning, ensuring that these technologies are developed and deployed in a responsible, transparent, and reproducible manner is of paramount importance. This special issue aims to gather pioneering research that addresses these critical dimensions, fostering the creation of machine learning systems that are not only effective but also trustworthy and ethically sound. We seek contributions that provide deep insights into the theoretical underpinnings, innovative methodologies, and practical applications of responsible and reproducible machine learning. We welcome original research articles, comprehensive reviews, and case studies that delve into the multifaceted aspects of this field. Our goal is to create a platform for interdisciplinary dialogue and to highlight innovative approaches that advance the principles of explainability, transparency, ethical AI, and privacy-preserving analytics. Submissions that explore novel frameworks, propose new models, or offer empirical evaluations are particularly encouraged. 
By bringing together diverse perspectives and cutting-edge research, this special issue aims to drive forward the discourse on how to responsibly benefit from the power of machine learning technologies.
Topics of interest:
- Methods and frameworks for ensuring explainability in machine learning models
- Techniques for enhancing transparency in machine learning processes
- Ethical considerations and guidelines for responsible AI development
- Approaches to privacy-preserving analytics in machine learning
- Case studies on the implementation of responsible and reproducible machine learning in various industries
- Cross-disciplinary approaches to integrating ethical principles in machine learning
- Evaluation metrics and benchmarks for reproducibility in machine learning research
- Impact of regulatory frameworks on the development and deployment of machine learning systems
- Algorithmic fairness and bias mitigation in machine learning models
- Verification and validation of machine learning systems for reproducibility
- Design and implementation of ethical AI frameworks and toolkits
- User-centric approaches to explainability and transparency in AI systems
Guest Editors (in alphabetical order): Dr. Ebrahim Bagheri, Dr. Marina Sokolova, Dr. Sebastien Gambs, Dr. Nathalie Japcowicz, Dr. Amílcar Soares
Editor-in-Chief: Dr. Diana Inkpen
Important dates:
- Submission deadline: 30 September 2024
- Initial notification: 15 November 2024
- Revisions due: 15 December 2024
- Notification: 15 January 2025
CAIAC - Canadian Artificial Intelligence Association Wiley
-
Sunayana Sitaram
PARIKSHA update! TLDR; We did 90k human evaluations (thanks to Karya!) across 10 Indian languages and 30 models (probably the largest multilingual human evaluation of LLMs so far?). GPT-4o consistently performs best and Llama-3 70B is close behind. We also performed LLM-based evaluations and found that agreement with human evaluation is higher in some cases, particularly pairwise evaluation in some languages, while agreement on prompts containing cultural nuances is lower (so don't use LLM evaluators for this yet!). Abstract: Evaluation of multilingual Large Language Models (LLMs) is challenging due to a variety of factors -- the lack of benchmarks with sufficient linguistic diversity, contamination of popular benchmarks into LLM pre-training data and the lack of local, cultural nuances in translated benchmarks. In this work, we study human and LLM-based evaluation in a multilingual, multi-cultural setting. We evaluate 30 models across 10 Indic languages by conducting 90K human evaluations and 30K LLM-based evaluations and find that models such as GPT-4o and Llama-3 70B consistently perform best for most Indic languages. We build leaderboards for two evaluation settings - pairwise comparison and direct assessment and analyse the agreement between humans and LLMs. We find that humans and LLMs agree fairly well in the pairwise setting but the agreement drops for direct assessment evaluation especially for languages such as Bengali and Odia. We also check for various biases in human and LLM-based evaluation and find evidence of self-bias in the GPT-based evaluator. Our work presents a significant step towards scaling up multilingual evaluation of LLMs. Preprint: https://lnkd.in/g_m5beGJ We will continue to add more prompts, models and languages in future rounds of Pariksha. Work done with the fantastic team at Microsoft Research India and Karya - Ishaan Watts, Varun Gumma, Aditya Yadavalli, Vivek Seshadri, Swami Manohar. #multilingual #evaluation #genai #indic
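The human-vs-LLM agreement analysis described above can be quantified with a chance-corrected statistic such as Cohen's kappa. A minimal sketch on invented judgments (the labels below are hypothetical, not PARIKSHA data, and the paper may use a different agreement measure):

```python
from collections import Counter

def cohens_kappa(a, b):
    """Chance-corrected agreement between two judges' label sequences.
    1.0 means perfect agreement; 0.0 means chance-level agreement."""
    n = len(a)
    po = sum(x == y for x, y in zip(a, b)) / n  # observed agreement
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[c] * cb[c] for c in ca.keys() | cb.keys()) / (n * n)  # expected by chance
    return (po - pe) / (1 - pe)

# Hypothetical pairwise winners chosen by a human judge vs. an LLM judge:
human = ["A", "B", "A", "B", "tie", "A"]
llm   = ["A", "B", "B", "B", "tie", "A"]
print(round(cohens_kappa(human, llm), 3))  # → 0.739
```

The same function applies to direct-assessment labels (score buckets instead of A/B winners), which is where the post reports agreement dropping for languages like Bengali and Odia.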
-
Sreekanth Madisetty, PhD
IndicGenBench Google Research India recently released IndicGenBench, a multilingual benchmark to evaluate generation capabilities of LLMs on 29 Indic languages spanning 13 writing scripts and 4 language families. Extending the datasets in Cross-lingual Summarization, Machine Translation, Multi-lingual Question Answering, and Cross-lingual Question Answering, the team has collected human translations of English examples into target Indic languages, thereby extending the scope and applicability of evaluation metrics in this domain. One of the key insights from their study is the analysis of token fertility across all Indic languages within IndicGenBench. Token fertility, representing the average number of sub-words that a word is broken down into by the tokenizer, varies significantly across languages. Some languages have simple breakdowns, while others are more complex. Now, why does this matter? Well, it affects how well the language models work. Languages with more complex breakdowns might struggle because they can't use as many examples to learn from. They found that languages with simpler breakdowns can use more examples effectively compared to those with complex breakdowns. #LLMs #GenAI #IndicGenBench #AI #IndicLanguages #IndicDatasets #multilingual
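Token fertility as defined above is straightforward to compute: total sub-word pieces divided by total words. A minimal sketch, using a hypothetical fixed-width toy tokenizer in place of a real sub-word tokenizer like the ones behind IndicGenBench's analysis:

```python
def token_fertility(words, tokenize):
    """Average number of sub-word pieces per word. Higher fertility means
    the tokenizer fragments this language's words into more pieces."""
    pieces = [tokenize(w) for w in words]
    return sum(len(p) for p in pieces) / len(words)

# Toy stand-in tokenizer: split a word into chunks of at most 4 characters.
def chunk4(word):
    return [word[i:i + 4] for i in range(0, len(word), 4)]

# "cat" → 1 piece, "running" → 2, "internationalization" → 5.
print(token_fertility(["cat", "running", "internationalization"], chunk4))
```

With a real tokenizer you would pass its encode function instead of `chunk4`; comparing the resulting averages across languages reproduces the kind of fertility gap the post describes.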
-
ArunKumar R
Transformers
Transformers have been a major milestone in the field of NLP and are heavily used in Generative AI. There are three variants of transformer-based models:
1. Encoder-only
2. Decoder-only
3. Encoder-decoder

Encoder-only models: These are also called autoencoders and are pretrained using a technique called masked language modeling. A text with a random masked token is sent to the model to predict the masked token. For example, consider the text "If you don't stop at the sign, you will get a ticket". The training input passed to the encoder model is "If you don't _____ at the sign, you will get a ticket", and the model is expected to predict the token (word) "stop". These models use bi-directional representations of the input to better understand the full context of a token. Examples: BERT-family models.

Decoder-only models: These models are called autoregressive models and are pretrained using a technique called causal language modeling. They predict the next token using the previous tokens. For the same text, the training input passed to the decoder model is "If you don't ______". The model will still try to predict the word "stop", but only based on previous tokens. These models are used for generative tasks, including question answering. Examples: GPT family, Falcon, Llama models.

Encoder-decoder models: These models are called sequence-to-sequence models, and the pretraining objective varies from model to model. For example, the popular FLAN-T5 uses consecutive multi-token masking called span corruption. For the same text, the training input passed to the encoder is "If you don't _____ ______ the sign, you will get a ticket", and the model will try to predict the tokens "stop at". These models are good at translation tasks. Examples: T5 family.

In all the above explanations we took just one sentence of text. The LLMs you see in the market are trained on huge volumes of text available over the internet. #LLM #encoder #decoder #transformer #genai
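The difference between the bidirectional (encoder) and causal (decoder) objectives described above comes down to which positions each token may attend to. A minimal sketch of the two attention-mask patterns (1 = may attend, 0 = masked out):

```python
def causal_mask(n):
    """Decoder-style mask: position i may attend only to positions <= i,
    as in causal (autoregressive) language modeling."""
    return [[1 if j <= i else 0 for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """Encoder-style mask: every position attends to every position,
    as in masked language modeling (BERT-family pretraining)."""
    return [[1] * n for _ in range(n)]

# The causal mask is lower-triangular: row i has ones only up to column i.
for row in causal_mask(4):
    print(row)
```

An encoder-decoder model combines both: bidirectional attention over the input plus causal attention over the output it generates.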
-
Mohammed Zaki
Check out our recent work on Triplet Graph Transformers, to be presented at ICML'24 by Md Shamim Hussain, Mohammed Zaki, and Dharmashankar Subramanian. Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers https://lnkd.in/e3mz5CuF Abstract: Graph transformers typically lack third-order interactions, limiting their geometric understanding which is crucial for tasks like molecular geometry prediction. We propose the Triplet Graph Transformer (TGT) that enables direct communication between pairs within a 3-tuple of nodes via novel triplet attention and aggregation mechanisms. TGT is applied to molecular property prediction by first predicting interatomic distances from 2D graphs and then using these distances for downstream tasks. A novel three-stage training procedure and stochastic inference further improve training efficiency and model performance. Our model achieves new state-of-the-art (SOTA) results on open challenge benchmarks PCQM4Mv2 and OC20 IS2RE. We also obtain SOTA results on QM9, MOLPCBA, and LIT-PCBA molecular property prediction benchmarks via transfer learning. We also demonstrate the generality of TGT with SOTA results on the traveling salesman problem (TSP).
-
Kostas Alexis
Excited to share the topics of our two just accepted IEEE/RSJ IROS 2024 papers, as both represent key directions for our efforts on resilient autonomy. 1. Marvin Chayton Harms, Mihir Kulkarni, Nikhil Vijay Khedekar, Martin Jacquet, Kostas Alexis, "Neural Control Barrier Functions for Safe Navigation", 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024), Abu Dhabi, UAE Video: https://lnkd.in/dcYbnzne >> In this work we introduce the concept of Neural CBFs for safe autonomous navigation exploiting exteroceptive data and without the need for a map or consistent odometry estimation. 2. Mohit Singh, Kostas Alexis, "Online Refractive Camera Model Calibration in Visual Inertial Odometry", 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024), Abu Dhabi, UAE Video: https://lnkd.in/d_iyQdgp >> In this work we present online visual-inertial odometry for underwater systems that explicitly accounts for refractive effects and co-estimates the refractive index of a medium in real-time thus eliminating the need for medium-specific camera calibration. #robotics #autonomy #ntnu
-
Raghul Asokan
Hello All! It is great to be back with another article in my series “Neural Networks Intuitions - 18. Generative Pretrained Transformer (GPT) Series". The past 1.5-2 years have been dominated by Large Language Models (LLMs) ever since the release of GPT-3.5 and GPT-4, given their fantastic ability to solve a wide range of natural language tasks, right from question answering, summarization and visual QA to even code generation. And anyone who has followed this recent trend would know about the talk regarding the emergence of AGI, which has led to two schools of thought: one set of people truly believes that Language Models can lead us to AGI, while the other set sharply denies this by arguing that although there are a few signs of reasoning, there is no true extrapolation as such, and all of it comes down to the training data (and its scale). The latter group's criticism is that unless there is visibility into the training data distribution, along with a test set (not contaminated by the training set) and a proper evaluation methodology, it is unfair to say that these LLMs can actually lead to AGI. Now, in order to truly validate some of these closed-source GPT models (such as GPT-4, Gemini, Claude, etc.), we need to be aware of the training data distribution, which is highly unlikely. But another way to understand whether LLMs, or more specifically instruction-tuned LLMs, can actually generalize to novel distributions is to dig deep into the evolution of these architectures/algorithms, more specifically the evolution of GPT models. In this article, I go over a series of GPT papers - GPT-1, GPT-2, GPT-3 and InstructGPT - and try to understand what makes these models truly generalizable or universal. Do read and comment your thoughts/opinions :) https://lnkd.in/di8_VpJA #neuralnetworks #largelanguagemodels #GPT #incontextlearning #nlp
-
Liang Sun
Our latest paper, "Distributed Matching-by-Clone Hungarian-Based Algorithm (DMCHBA) for Task Allocation of Multi-agent Systems," has just been published in IEEE Transactions on Robotics! This work introduces a scalable distributed task allocation algorithm designed to operate efficiently across multi-agent systems with any communication network topology. This paper has been selected for presentation at the 2024 IEEE International Conference on Robotics and Automation in Japan! DMCHBA is tailored for complex multi-agent environments, ensuring flexibility across various network topologies. Through rigorous Monte Carlo simulations, DMCHBA has demonstrated superior performance in terms of both overall cost and runtime compared to state-of-the-art algorithms. We provide a detailed analysis of the algorithmic computational complexity and communication complexity, reinforcing the robustness of DMCHBA. This achievement is a testament to the hard work and dedication of Arezoo Samiei, PhD. We believe that these findings will make a significant impact on task allocation processes in robotic systems, pushing the boundaries of what's possible in automated environments. Read the paper here: https://lnkd.in/g7RAp5fS. We are working to publish the code on MATLAB File Exchange. More to come!
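As context for the task-allocation problem DMCHBA addresses: the classical (centralized) Hungarian algorithm finds the agent-to-task assignment minimizing total cost. Here is a brute-force toy baseline (purely illustrative; this is not the distributed DMCHBA algorithm, and the cost matrix is invented):

```python
from itertools import permutations

def optimal_assignment(cost):
    """Exhaustively assign agent i to task perm[i], minimizing total cost.
    O(n!) time, so useful only as a ground-truth check at toy sizes; the
    Hungarian algorithm solves the same problem in polynomial time."""
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best_perm, best_cost = perm, c
    return best_perm, best_cost

# cost[i][j] = cost for agent i to perform task j (made-up numbers).
cost = [[4, 1, 3],
        [2, 0, 5],
        [3, 2, 2]]
print(optimal_assignment(cost))  # → ((1, 0, 2), 5)
```

A distributed algorithm like DMCHBA must reach the same kind of optimal assignment while agents exchange information only over a communication network, rather than seeing the full cost matrix centrally.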
-
Arvind Narayanan
Excited to announce that our REFORMS checklist for ML-based science is now out in Science Advances! We review common errors in ML for science, create a checklist of 32 items applicable across disciplines, and provide in-depth guidelines for each item. https://lnkd.in/dZHFswzk It was a pleasure working on REFORMS with a cross-disciplinary team of computer scientists, sociologists, mathematicians, economists, and health researchers. If you are doing ML-based science, we hope you find it useful and would love to hear your feedback. The authors are Sayash Kapoor, Emily Cantrell, Kenny Peng, Hien Pham, Christopher Bail, Odd Erik Gundersen, Jake Hofman, Jessica Hullman, Michael Lones, Momin M. Malik, Priyanka Nanayakkara, Russell Poldrack, Deborah Raji, Michael Roberts, Matthew Salganik, Marta Serra-Garcia, Brandon Stewart, Gilles Vandewiele, and me. Here are the videos from the Princeton University workshop on The Reproducibility Crisis in ML‑based Science that led to the paper: https://lnkd.in/dAi5Skds Finally, here's a nice writeup from Princeton Engineering. "Science has an AI problem. This group says they can fix it." https://lnkd.in/di5xuw3D
-
Patrick Hall
Interesting analysis of real-world #AI incident reports over time by Sherry Chen, Steven Shen, and shyam sivasubramanian: https://lnkd.in/euEBcefx. These student authors were able to: - Define clusters for AI incidents over roughly 10 years, with particularly distinct groups for driverless car issues and chatbot flubs. - Define axes of physical vs. digital and narrow vs. broad scope for grouping incident reports. One thing I'll highlight is that issues relating to superintelligence or existential risk don't appear in empirical AI incident report data. Broken chatbots, extortion/scams/snakeoil, bias and driverless car crashes do. While some AI systems' capabilities for disinformation or surveillance are dangerous, most systems are not risky because they are super powerful. They're risky because they are rushed to market, fundamentally flawed or misaligned in their application context, and because we are still in the early, immature days of this field. Also, our data is freely available for you to come to your own conclusions: https://lnkd.in/enPsjsji. cc: Digital Safety Research Institute, The Center for Advancing Safety of Machine Intelligence (CASMI), Missy Cummings, Daniel Atherton
-
Michael Bremner
It's an active day on the arXiv for the QSI team! Two papers emerging from the DARPA Quantum Benchmarking program (https://lnkd.in/guubMFvf and https://lnkd.in/gs9KgdT3) and another (https://lnkd.in/gJC6wbRR) by Zixin Huang, a fellow at Macquarie and long-term visitor to our Centre. In "Quantum computing for corrosion-resistant materials and anti-corrosive coatings design" our team worked with collaborators at HRL, Boeing, Wisconsin, and MIT LL to investigate how quantum computers could be used in existing workflows to discover new anti-corrosives. This has been a really fun project where we've been able to learn a lot from subject matter experts in materials design and modelling to understand how they are currently attacking this difficult, but highly valuable, problem. We look at two different types of corrosion, potential materials, and corresponding workflows, and estimate the cost of using QPE as a key subroutine in these workflows using a new package, pyLIQTR https://lnkd.in/g8WrCySs, and a host of other tools to set up the instances. Not surprisingly, we find that the costs are high! We intentionally focussed on improvements to existing workflows and techniques so that we could understand the costs of heuristic methods without depending on new breakthroughs in quantum algorithms in this space. I personally believe that there is a lot of opportunity to bring down the cost of such computations through improved workflows, algorithms, and analysis. This is a great challenge problem that we hope will motivate improvements in quantum algorithms and software!
-
Gregory Mermoud
Very insightful work by Anthropic’s interpretability team. And an amazing paper, with outstanding writing and figures. The idea is very simple: interpret LLMs by leveraging sparse autoencoders as surrogate models of the MLPs of transformer blocks, which allows one to disambiguate the superposition of features captured by a single neuron. A simple idea, but a very careful and complex execution, as is often the case in our line of work. The paper goes into many details and provides a large array of insights, although the gist of the implementation remains obfuscated due to the closed-source nature of Claude. Too bad, because this is the kind of work that we need to better understand and eventually trust LLMs. This is demonstrated by the authors in the section ‘Influence on Behavior’, where they show that clamping some features to either a high or low value during inference is “remarkably effective at modifying model outputs in specific, interpretable ways”. Hopefully this kind of work will be replicated and generalized to open-weights models, so that we have new ways to steer their behavior. https://lnkd.in/eVym7f_f #interpretability #xai #explainableai #steerableai #anthropic #claude
-
Laura Assis
I am proud to share that our paper, "Greedy Recursive Spectral Bisection for Modularity-Bound Hierarchical Divisive Community Detection," has been published in Statistics and Computing. Thanks to an agreement between Springer and FCT - Portugal, you can access the full article for free! Dive into our innovative approach for community detection and explore the detailed methodology and results. Check it out here: https://rdcu.be/dL8bM #StatisticsAndComputing #CommunityDetection #Research #OpenAccess #Springer #PPCIC #PPPRO #CEFETRJ #DataScience
-
Noam Zelcer
Sebastian's most recent paper on the functional interaction between SPRING and S1P is now online in MCB. A great collaborative effort that demonstrates that SPRING is a specific factor for S1P-mediated activation of SREBPs and lipid metabolism. Stay tuned... more to come. Have fun reading and let us know what you think of this #Zelcerlab #Lipidmetabolism #cholesterol #SPRING