PhD Research Intern - Vision Language Mo
Dolby Laboratories
Sunnyvale, CA
Join the leader in entertainment innovation and help us design the future. The Dolby U internship program offers impactful, project-based work experience in a collaborative, creative environment where you work side by side with industry leaders. Amplify your insatiable curiosity by implementing real-world solutions that revolutionize how people communicate and how entertainment is created, delivered, and enjoyed worldwide. We offer a collegial culture, challenging projects, and excellent compensation and benefits, not to mention a Flex Work approach that is truly flexible to support where, when, and how you do your best work. For any student seeking to gain invaluable expertise through meaningful, personal contributions, we invite you to join us in continuing to design a future where technology meets entertainment!
The Advanced Technology Group (ATG) is the research division of the company. ATG’s mission is to look ahead, deliver insights, and innovate technological solutions that will fuel Dolby’s continued growth. Our researchers have a broad range of expertise related to computer science and electrical engineering, such as AI/ML, algorithms, digital signal processing, audio engineering, image processing, computer vision, data science & analytics, distributed systems, cloud, edge & mobile computing, computer networking, and IoT.
About The Role
As a Research Intern – Vision Language Models at Dolby, you will have the opportunity to work
on cutting-edge vision language technologies. With the guidance of Dolby’s leading media
technology experts, you will solve a real-world vision language problem: generating
customized textual output based on context in visual, textual, and other modalities to create
advanced consumer experiences in multimedia entertainment.
This opportunity will be based out of our research office in Sunnyvale, California.
What are we looking for in candidates?
We are seeking current PhD students with a background and experience in vision language
models, specifically for video analysis, understanding, question answering, and/or text
generation. To be eligible, you should have completed at least one year of your doctoral
program. Along with solid technical skills, candidates should demonstrate problem-solving and
analytical abilities, good communication and collaboration skills, a curiosity for how and why
things work as they do, and a passion for video or natural language technology. You have a
desire to bring in new ideas and are open to learning from others.
Summary of Position:
We are a key research team within Dolby’s Advanced Technology Group, focused on creating
cutting-edge imaging, computer vision, vision language, and multimodal technologies that drive
next-generation consumer experiences. We are looking for strong candidates with an interest in
one or more of the following areas:
- Vision language models for content analysis and generation (video and/or text).
- Video content analysis, understanding, and information retrieval.
- Text generation using large language model (LLM), or multimodal LLM based on context
Requirements:
- PhD students in Artificial Intelligence, Electrical Engineering, Computer Science, or
- Proven ability to pursue new areas of vision language research for AI, or data analysis
reviewed journals and conferences.
- Experience as a researcher, including internships, full-time, or at a lab.
- Proficiency in Python and PyTorch.
- Creating demos and prototypes for research applications.
- Developing vision language models for image/video analysis, understanding,
- Experience in finetuning multimodal LLMs and/or multimodal foundation models.
- Writing technical reports and/or publications.
- First-author publications at peer-reviewed conferences or journals.
considered, we recommend submitting your application by June 28, 2024.
Eligibility:
Currently enrolled in a doctorate program in Computer Science, Electrical Engineering, Computer
Engineering, or a related field. Must be available to work full-time Monday – Friday for 12 weeks
between September 2024 – December 2024.
Start date for the internship is as follows: (*note* this date is not flexible)
- Monday, September 23, 2024
Dolby will consider qualified applicants with criminal histories in a manner consistent with the requirements of San Francisco Police Code, Article 49, and Administrative Code, Article 12
Equal Employment Opportunity:
Dolby is proud to be an equal opportunity employer. Our success depends on the combined skills and talents of all our employees. We are committed to making employment decisions without regard to race, religious creed, color, age, sex, sexual orientation, gender identity, national origin, religion, marital status, family status, medical condition, disability, military service, pregnancy, childbirth and related medical conditions or any other classification protected by federal, state, and local laws and ordinances.
Seniority level: Not Applicable
Employment type: Full-time
Job function: Other
Industries: Broadcast Media Production and Distribution, Computers and Electronics Manufacturing, and Entertainment Providers