“Annie possesses the acumen to effectively strategize and plan research projects of any scale and complexity! I had the pleasure of working with her for more than a year at VerticalScope, where she led the R&D team's research efforts on Machine Learning, Natural Language Processing, and Social Network Analysis. I closely worked with her on projects involving Named Entity Recognition and Linking, Content Recommender, and User Influence. Throughout all these projects, she was meticulous and organized in her research methods, where she carefully filtered, analyzed and summarized state-of-the-art research advancements. Annie also came up with creative and novel research ideas for the team to quickly test concepts and implement prototypes. Moreover, she has the knack for laying out well-planned project roadmaps which were pivotal for the engineering team's success in all these projects. Annie was never afraid to take initiative; I was especially impressed with her proactiveness in the Topic Modelling project wherein she reached out to domain experts outside of our team in order to get valuable assistance and feedback on our work. Annie has a unique mix of skills involving people, processes and technology that will make her a great asset for any company!”
Activity
-
Congratulations to E. David Guzmán Ramírez on being featured by the Master of Science in Applied Computing (MScAC). It was great to experience NAACL…
Shared by Annie Lee
-
Very proud of my MBZUAI Undergraduate Research Internship Program (UGRIP) team for receiving the best team award! Congrats to Annant Jain, Lihan…
Liked by Annie Lee
Publications
-
Discovering co-occurring patterns and their biological significance in protein families
BMC Bioinformatics
The large influx of biological sequences makes it important to identify and correlate conserved regions in homologous sequences in order to acquire valuable biological knowledge. These conserved regions contain statistically significant residue associations as sequence patterns. Thus, patterns from two conserved regions that co-occur frequently on the same sequences are inferred to have joint functionality. A method for finding conserved regions in protein families with frequent co-occurrence patterns is proposed. The biological significance of the discovered clusters of conserved regions with co-occurrence patterns can be validated by the three-dimensional closeness of their amino acids and by the biological functionality found in those regions, as supported by published work.
Thus, our co-occurrence clustering algorithm can efficiently find and rank conserved regions that contain patterns that frequently co-occur on the same proteins. Co-occurring patterns are biologically significant due to their three-dimensional closeness and other evidence reported in the literature. These results play an important role in drug discovery, as biologists can quickly identify drug targets for detailed pre-clinical studies.
-
Ranking and compacting binding segments of protein families using aligned pattern clusters
Proteome Science
Discovering sequence patterns with variation can unveil functions of a protein family that are important for drug discovery. Exploring protein families using existing methods such as multiple sequence alignment is computationally expensive, so pattern search, called motif finding in bioinformatics, is used instead. However, at present, combinatorial algorithms result in large sets of solutions, and probabilistic models require a richer representation of the amino acid associations. To overcome these shortcomings, we present a method for ranking and compacting these solutions in a new representation referred to as Aligned Pattern Clusters (APCs). To tackle the problem of a large solution set, our method reveals a reduced set of candidate solutions without losing any information. To address the problem of representation, our method captures the amino acid associations and conservations of the aligned patterns. Our algorithm renders a set of APCs in which a set of patterns is discovered, pruned, aligned, and synthesized from the input sequences of a protein family.
-
sGAL: a computational method for finding surface exposed sites in proteins suitable for Cys-mediated cross-linking
Bioinformatics
sGAL is a computer program designed to find pairs of sites suitable for introducing chemical cross-links into proteins. sGAL takes a protein structure file in PDB format as input, truncates each residue sequentially to its gamma side chain atom to mimic mutation to Cys, and calculates the exposed surface area of the gamma atom. The user then inputs the minimum and maximum lengths of the cross-linker. sGAL provides as output pairs of residues that would have exposed gamma atom separations that fall within this range. Furthermore, if a line joining the pair of gamma atoms contacts more than a given number of buried atoms, that pair is discarded. In this way, sites for which the protein would sterically interfere with cross-linking are avoided.
AVAILABILITY:
http://www.chem.utoronto.ca/staff/GAW/links.html (Surface Racer is also required; see http://monte.biochem.wisc.edu/~tsodikov/surface.html).
Patents
-
Aligning and clustering sequence patterns to reveal classificatory functionality of sequences
Filed CA 05202585-27USPR
Honors & Awards
-
Summer Institute in Taiwan
Natural Sciences and Engineering Research Council of Canada and National Science Council in Taiwan
The Summer Programs in Taiwan provide graduate students in science and engineering with hands-on research experience and an introduction to a different culture, language, and university research system.
-
Graduate Scholarship
University of Waterloo
The University of Waterloo Graduate Scholarships are disbursed by departments/units, normally to graduate students registered full-time with a first-class (80%) standing. No application is required; departments nominate students based on the number of awards available. The awards may be given as UW Graduate Entrance Scholarships, UW Merit Scholarships or UW Scholarships. These awards are normally paid across three terms. The minimum value of the award is normally $1,000 per term, but may vary.
-
Post Graduate Scholarship
Natural Sciences and Engineering Research Council of Canada
The NSERC Postgraduate Scholarships-Doctoral Program (PGS D) provides financial support to high-calibre scholars who are engaged in a doctoral program in the natural sciences or engineering. The CGS D will be offered to the top-ranked applicants, and the next tier of meritorious applicants will be offered an NSERC PGS D. This support allows these scholars to fully concentrate on their studies and seek out the best research mentors in their chosen fields.
-
President’s Graduate Scholarship
University of Waterloo
Waterloo continues to attract the best students by providing the President's Graduate Scholarship (PGS) (up to $10,000 per year) to both incoming and continuing graduate students receiving Tri-agency (NSERC/SSHRC/CIHR) and provincial (OGS) competition-based scholarships. The PGS is provided in each year the eligible scholarship is held while the recipient is registered in a Master's or PhD program at the University of Waterloo.
-
Provost’s Doctoral Entrance Award for Women
University of Waterloo
The main purpose of this award is to provide any outstanding full-time female doctoral student (Canadian citizen, permanent resident or student on a study permit) an entrance scholarship in the amount of $5,000 for one year. The award is normally paid across two terms.
Languages
-
English
Full professional proficiency
-
Mandarin
Limited working proficiency
Recommendations received
3 people have recommended Annie
More activity by Annie
-
Apply for a Lacuna Fund grant to create language data for African and Latin American languages, backed by Google.org
Liked by Annie Lee
-
WMT24 submission week just started! We released General MT testsets. More information at https://lnkd.in/ep_BK-fm What's new this year: -…
Liked by Annie Lee
-
Very proud of the work we are releasing today with Aakanksha, Arash Ahmadian, Beyza Ermis, Seraphina Goldfarb-Tarrant, Julia Kreutzer, Marzieh Fadaee ✨…
Liked by Annie Lee
-
Huge congratulations to Annie Tang who was recently awarded the University of Toronto Department of Statistical Sciences Data Science Award. This…
Liked by Annie Lee
-
This week, we will focus on a connection that I have found interesting recently involving stochastic gradient descent (SGD). We have talked about SGD…
Liked by Annie Lee