Showing 1–50 of 107 results for author: Tseng, Y

  1. arXiv:2407.04245  [pdf, other]

    cs.CV

    Every Pixel Has its Moments: Ultra-High-Resolution Unpaired Image-to-Image Translation via Dense Normalization

    Authors: Ming-Yang Ho, Che-Ming Wu, Min-Sheng Wu, Yufeng Jane Tseng

    Abstract: Recent advancements in ultra-high-resolution unpaired image-to-image translation have aimed to mitigate the constraints imposed by limited GPU memory through patch-wise inference. Nonetheless, existing methods often compromise between the reduction of noticeable tiling artifacts and the preservation of color and hue contrast, attributed to the reliance on global image- or patch-level statistics in…

    Submitted 5 July, 2024; originally announced July 2024.

    Comments: Accepted to ECCV 2024

  2. arXiv:2406.07237  [pdf, other]

    eess.AS cs.SD

    CodecFake: Enhancing Anti-Spoofing Models Against Deepfake Audios from Codec-Based Speech Synthesis Systems

    Authors: Haibin Wu, Yuan Tseng, Hung-yi Lee

    Abstract: Current state-of-the-art (SOTA) codec-based audio synthesis systems can mimic anyone's voice with just a 3-second sample from that specific unseen speaker. Unfortunately, malicious attackers may exploit these technologies, causing misuse and security issues. Anti-spoofing models have been developed to detect fake speech. However, the open question of whether current SOTA anti-spoofing models can e…

    Submitted 11 June, 2024; originally announced June 2024.

    Comments: Accepted to Interspeech 2024, project page: https://codecfake.github.io/

  3. arXiv:2406.05755  [pdf, other]

    cs.CV

    A DeNoising FPN With Transformer R-CNN for Tiny Object Detection

    Authors: Hou-I Liu, Yu-Wen Tseng, Kai-Cheng Chang, Pin-Jyun Wang, Hong-Han Shuai, Wen-Huang Cheng

    Abstract: Despite notable advancements in the field of computer vision, the precise detection of tiny objects continues to pose a significant challenge, largely owing to the minuscule pixel representation allocated to these objects in imagery data. This challenge resonates profoundly in the domain of geoscience and remote sensing, where high-fidelity detection of tiny objects can facilitate a myriad of appl…

    Submitted 15 June, 2024; v1 submitted 9 June, 2024; originally announced June 2024.

    Comments: The article is accepted by IEEE Transactions on Geoscience and Remote Sensing. Our code will be available at https://github.com/hoiliu-0801/DNTR

  4. arXiv:2406.01356  [pdf, other]

    cs.CV

    MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images

    Authors: Ke-Lei Wang, Pin-Hsuan Chou, Young-Ching Chou, Chia-Jen Liu, Cheng-Kuan Lin, Yu-Chee Tseng

    Abstract: While there are a lot of models for instance segmentation, PolarMask stands out as a unique one that represents an object by a Polar coordinate system. With an anchor-box-free design and a single-stage framework that conducts detection and segmentation at one time, PolarMask is proved to be able to balance efficiency and accuracy. Hence, it can be easily connected with other downstream real-time a…

    Submitted 3 June, 2024; originally announced June 2024.

  5. arXiv:2406.01171  [pdf, other]

    cs.CL

    Two Tales of Persona in LLMs: A Survey of Role-Playing and Personalization

    Authors: Yu-Min Tseng, Yu-Chao Huang, Teng-Yun Hsiao, Wei-Lin Chen, Chao-Wei Huang, Yu Meng, Yun-Nung Chen

    Abstract: The concept of persona, originally adopted in dialogue literature, has re-surged as a promising framework for tailoring large language models (LLMs) to specific context (e.g., personalized search, LLM-as-a-judge). However, the growing research on leveraging persona in LLMs is relatively disorganized and lacks a systematic taxonomy. To close the gap, we present a comprehensive survey to categorize…

    Submitted 26 June, 2024; v1 submitted 3 June, 2024; originally announced June 2024.

    Comments: 8-page version

  6. arXiv:2405.13720  [pdf, other]

    cond-mat.str-el

    Spin-orbital excitations encoding the magnetic phase transition in the van der Waals antiferromagnet FePS$_{3}$

    Authors: Yuan Wei, Yi Tseng, Hebatalla Elnaggar, Wenliang Zhang, Teguh Citra Asmara, Eugenio Paris, Gabriele Domaine, Vladimir N. Strocov, Luc Testa, Virgile Favre, Mario Di Luca, Mitali Banerjee, Andrew R. Wildes, Frank M. F. de Groot, Henrik M. Ronnow, Thorsten Schmitt

    Abstract: In the rich phases of van der Waals (vdW) materials featuring intertwined electronic order and collective phenomena, characterizing elementary dynamics that entail the low-energy Hamiltonian and electronic degrees of freedom is of paramount importance. Here we performed resonant inelastic X-ray scattering (RIXS) to elaborate the spin-orbital ground and excited states of the vdW antiferromagnetic i…

    Submitted 22 May, 2024; originally announced May 2024.

  7. arXiv:2405.07006  [pdf, other]

    cs.CL

    Word-specific tonal realizations in Mandarin

    Authors: Yu-Ying Chuang, Melanie J. Bell, Yu-Hsiang Tseng, R. Harald Baayen

    Abstract: The pitch contours of Mandarin two-character words are generally understood as being shaped by the underlying tones of the constituent single-character words, in interaction with articulatory constraints imposed by factors such as speech rate, co-articulation with adjacent tones, segmental make-up, and predictability. This study shows that tonal realization is also partially determined by words' m…

    Submitted 11 May, 2024; originally announced May 2024.

  8. arXiv:2404.16670  [pdf, other]

    cs.CV cs.AI

    EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning

    Authors: Hongxia Xie, Chu-Jun Peng, Yu-Wen Tseng, Hung-Jen Chen, Chan-Feng Hsu, Hong-Han Shuai, Wen-Huang Cheng

    Abstract: Visual Instruction Tuning represents a novel learning paradigm involving the fine-tuning of pre-trained language models using task-specific instructions. This paradigm shows promising zero-shot results in various natural language processing tasks but is still unexplored in vision emotion understanding. In this work, we focus on enhancing the model's proficiency in understanding and adhering to ins…

    Submitted 25 April, 2024; originally announced April 2024.

    Comments: Accepted by CVPR 2024

  9. arXiv:2404.10818  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Nature of excitons and their ligand-mediated delocalization in nickel dihalide charge-transfer insulators

    Authors: Connor A. Occhialini, Yi Tseng, Hebatalla Elnaggar, Qian Song, Mark Blei, Seth Ariel Tongay, Valentina Bisogni, Frank M. F. de Groot, Jonathan Pelliciari, Riccardo Comin

    Abstract: The fundamental optical excitations of correlated transition-metal compounds are typically identified with multielectronic transitions localized at the transition-metal site, such as $dd$ transitions. In this vein, intense interest has surrounded the appearance of sharp, below band-gap optical transitions, i.e. excitons, within the magnetic phase of correlated Ni$^{2+}$ van der Waals magnets. The…

    Submitted 16 April, 2024; originally announced April 2024.

  10. arXiv:2404.02963  [pdf, other]

    cond-mat.str-el cond-mat.mtrl-sci

    Unraveling the Mn $L_3$-edge RIXS spectrum of lightly manganese doped Sr$_{3}$Ru$_{2}$O$_{7}$

    Authors: Wei-Yang Chen, Shih-Wen Huang, Yi Tseng, Wenliang Zhang, Eugenio Paris, Teguh Citra Asmara, Jenn-Min Lee, Thorsten Schmitt, Yu-Cheng Shao, Yi-De Chuang, Byron Freelon, Dao-Xin Yao, Trinanjan Datta

    Abstract: A resonant inelastic x-ray scattering (RIXS) experiment was performed at the Mn $L_3$ edge. A 10 $\%$ Mn-doped Sr$_{3}$Ru$_{2}$O$_{7}$ compound, where the Mn$^{3+}$ ions are in the 3$d^4$ state, was probed for $dd$ excitations. The dilute doping concentration allows one to treat the dopant Mn$^{3+}$ ions as effectively free in the host ruthenium compound. The local nature of $dd$ RIXS spectroscopy…

    Submitted 3 April, 2024; originally announced April 2024.

    Comments: 12 pages, 7 figures, see PDF text for full abstract info

  11. Help Supporters: Exploring the Design Space of Assistive Technologies to Support Face-to-Face Help Between Blind and Sighted Strangers

    Authors: Yuanyang Teng, Connor Courtien, David Angel Rios, Yves M. Tseng, Jacqueline Gibson, Maryam Aziz, Avery Reyna, Rajan Vaish, Brian A. Smith

    Abstract: Blind and low-vision (BLV) people face many challenges when venturing into public environments, often wishing it were easier to get help from people nearby. Ironically, while many sighted individuals are willing to help, such interactions are infrequent. Asking for help is socially awkward for BLV people, and sighted people lack experience in helping BLV people. Through a mixed-ability research-th…

    Submitted 12 March, 2024; originally announced March 2024.

    Comments: To Appear In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems (Honolulu, HI, USA) Association for Computing Machinery, New York, NY, USA. 24 pages

  12. arXiv:2403.04785  [pdf, other]

    cs.CL cs.AI

    Large Language Multimodal Models for 5-Year Chronic Disease Cohort Prediction Using EHR Data

    Authors: Jun-En Ding, Phan Nguyen Minh Thao, Wen-Chih Peng, Jian-Zhe Wang, Chun-Cheng Chug, Min-Chen Hsieh, Yun-Chien Tseng, Ling Chen, Dongsheng Luo, Chi-Te Wang, Pei-fu Chen, Feng Liu, Fang-Ming Hung

    Abstract: Chronic diseases such as diabetes are the leading causes of morbidity and mortality worldwide. Numerous research studies have been attempted with various deep learning models in diagnosis. However, most previous studies had certain limitations, including using publicly available datasets (e.g. MIMIC), and imbalanced data. In this study, we collected five-year electronic health records (EHRs) from…

    Submitted 2 March, 2024; originally announced March 2024.

  13. arXiv:2403.00728  [pdf]

    cond-mat.str-el

    Emergence of interfacial magnetism in strongly-correlated nickelate-titanate superlattices

    Authors: Teguh Citra Asmara, Robert J. Green, Andreas Suter, Yuan Wei, Wenliang Zhang, Grant Harris, Yi Tseng, Tianlun Yu, Davide Betto, Mirian Garcia-Fernandez, Stefano Agrestini, Yannick Maximilian Klein, Neeraj Kumar, Carlos William Galdino, Zaher Salman, Thomas Prokscha, Marisa Medarde, Elisabeth Müller, Yona Soh, Nicholas B. Brookes, Ke-Jin Zhou, Milan Radovic, Thorsten Schmitt

    Abstract: Strongly-correlated transition-metal oxides are widely known for their various exotic phenomena. This is exemplified by rare-earth nickelates such as LaNiO$_{3}$, which possess intimate interconnections between their electronic, spin, and lattice degrees of freedom. Their properties can be further enhanced by pairing them in hybrid heterostructures, which can lead to hidden phases and emergent phe…

    Submitted 1 March, 2024; originally announced March 2024.

    Comments: 41 pages, 13 figures

  14. arXiv:2402.13257  [pdf, other]

    physics.ins-det nucl-ex quant-ph

    Mechanical detection of nuclear decays

    Authors: Jiaxiang Wang, T. W. Penny, Juan Recoaro, Benjamin Siegel, Yu-Han Tseng, David C. Moore

    Abstract: We report the detection of individual nuclear $α$ decays through the mechanical recoil of the entire micron-sized particle in which the decaying nuclei are embedded. Momentum conservation ensures that such measurements are sensitive to any particles emitted in the decay, including neutral particles that may otherwise evade detection with existing techniques. Detection of the minuscule recoil of an…

    Submitted 8 July, 2024; v1 submitted 18 January, 2024; originally announced February 2024.

    Comments: 16 pages, 15 figures

    Journal ref: Phys. Rev. Lett. 133, 023602 (2024)

  15. arXiv:2402.03988  [pdf, other]

    eess.AS cs.CL cs.SD

    REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

    Authors: Liang-Hsuan Tseng, En-Pei Hu, Cheng-Han Chiang, Yuan Tseng, Hung-yi Lee, Lin-shan Lee, Shao-Hua Sun

    Abstract: Unsupervised automatic speech recognition (ASR) aims to learn the mapping between the speech signal and its corresponding textual transcription without the supervision of paired speech-text data. A word/phoneme in the speech signal is represented by a segment of speech signal with variable length and unknown boundary, and this segmental structure makes learning the mapping between speech and text…

    Submitted 28 May, 2024; v1 submitted 6 February, 2024; originally announced February 2024.

  16. arXiv:2401.09758  [pdf, other]

    cs.CL

    Resolving Regular Polysemy in Named Entities

    Authors: Shu-Kai Hsieh, Yu-Hsiang Tseng, Hsin-Yu Chou, Ching-Wen Yang, Yu-Yun Chang

    Abstract: Word sense disambiguation primarily addresses the lexical ambiguity of common words based on a predefined sense inventory. Conversely, proper names are usually considered to denote an ad-hoc real-world referent. Once the reference is decided, the ambiguity is purportedly resolved. However, proper names also exhibit ambiguities through appellativization, i.e., they act like common words and may den…

    Submitted 18 January, 2024; originally announced January 2024.

  17. arXiv:2312.16771  [pdf, other]

    cs.CV

    Scale-Aware Crowd Count Network with Annotation Error Correction

    Authors: Yi-Kuan Hsieh, Jun-Wei Hsieh, Yu-Chee Tseng, Ming-Ching Chang, Li Xin

    Abstract: Traditional crowd counting networks suffer from information loss when feature maps are downsized through pooling layers, leading to inaccuracies in counting crowds at a distance. Existing methods often assume correct annotations during training, disregarding the impact of noisy annotations, especially in crowded scenes. Furthermore, the use of a fixed Gaussian kernel fails to account for the varyi…

    Submitted 27 December, 2023; originally announced December 2023.

    Comments: 7 pages, 6 figures. arXiv admin note: text overlap with arXiv:2211.06835

  18. arXiv:2312.02362  [pdf, other]

    cs.CV cs.GR

    PointNeRF++: A multi-scale, point-based Neural Radiance Field

    Authors: Weiwei Sun, Eduard Trulls, Yang-Che Tseng, Sneha Sambandam, Gopal Sharma, Andrea Tagliasacchi, Kwang Moo Yi

    Abstract: Point clouds offer an attractive source of information to complement images in neural scene representations, especially when few images are available. Neural rendering methods based on point clouds do exist, but they do not perform well when the point cloud quality is low -- e.g., sparse or incomplete, which is often the case with real-world data. We overcome these problems with a simple represent…

    Submitted 21 March, 2024; v1 submitted 4 December, 2023; originally announced December 2023.

    Comments: Project website: https://pointnerfpp.github.io/

  19. arXiv:2311.16553  [pdf, other]

    cond-mat.str-el cond-mat.supr-con

    Magnon interactions in a moderately correlated Mott insulator

    Authors: Qisi Wang, S. Mustafi, E. Fogh, N. Astrakhantsev, Z. He, I. Biało, Ying Chan, L. Martinelli, M. Horio, O. Ivashko, N. E. Shaik, K. von Arx, Y. Sassa, E. Paris, M. H. Fischer, Y. Tseng, N. B. Christensen, A. Galdi, D. G. Schlom, K. M. Shen, T. Schmitt, H. M. Rønnow, J. Chang

    Abstract: Quantum fluctuations in low-dimensional systems and near quantum phase transitions have significant influences on material properties. Yet, it is difficult to experimentally gauge the strength and importance of quantum fluctuations. Here we provide a resonant inelastic x-ray scattering study of magnon excitations in Mott insulating cuprates. From the thin film of SrCuO$_2$, single- and bi-magnon d…

    Submitted 26 June, 2024; v1 submitted 28 November, 2023; originally announced November 2023.

    Journal ref: Nature Communications 15, 5348 (2024)

  20. arXiv:2311.09023  [pdf, other]

    cond-mat.str-el

    Single- and two-particle observables in the Emery model: a dynamical mean-field perspective

    Authors: Yi-Ting Tseng, M. O. Malcolms, Henri Menke, Marcel Klett, Thomas Schäfer, P. Hansmann

    Abstract: We compare the dynamical mean-field descriptions of the single-band Hubbard model and the three-band Emery model at the one- and two-particle level for parameters relevant to high-Tc superconductors. We show that even within dynamical mean-field theory, accounting solely for temporal fluctuations, the intrinsic multi-orbital nature of the Emery model introduces effective non-local correlations. Th…

    Submitted 15 November, 2023; originally announced November 2023.

    Comments: 7 pages, 7 figures

  21. arXiv:2311.08677  [pdf, other]

    cs.LG cs.DC cs.IT stat.ML

    Federated Learning for Sparse Principal Component Analysis

    Authors: Sin Cheng Ciou, Pin Jui Chen, Elvin Y. Tseng, Yuh-Jye Lee

    Abstract: In the rapidly evolving realm of machine learning, algorithm effectiveness often faces limitations due to data quality and availability. Traditional approaches grapple with data sharing due to legal and privacy concerns. The federated learning framework addresses this challenge. Federated learning is a decentralized approach where model training occurs on client sides, preserving privacy by keepin…

    Submitted 14 November, 2023; originally announced November 2023.

    Comments: 11 pages, 7 figures, 1 table. Accepted by IEEE BigData 2023, Sorrento, Italy

  22. arXiv:2309.12337  [pdf]

    cs.CY cs.AI

    ActiveAI: Introducing AI Literacy for Middle School Learners with Goal-based Scenario Learning

    Authors: Ying Jui Tseng, Gautam Yadav

    Abstract: The ActiveAI project addresses key challenges in AI education for grades 7-9 students by providing an engaging AI literacy learning experience based on the AI4K12 knowledge framework. Utilizing learning science mechanisms such as goal-based scenarios, immediate feedback, project-based learning, and intelligent agents, the app incorporates a variety of learner inputs like sliders, steppers, and col…

    Submitted 21 August, 2023; originally announced September 2023.

  23. arXiv:2309.10787  [pdf, other]

    eess.AS cs.CV cs.MM cs.SD

    AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models

    Authors: Yuan Tseng, Layne Berry, Yi-Ting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Po-Yao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Shinji Watanabe, Abdelrahman Mohamed, Chi-Luen Feng, Hung-yi Lee

    Abstract: Audio-visual representation learning aims to develop systems with human-like perception by utilizing correlation between auditory and visual information. However, current models often focus on a limited set of tasks, and generalization abilities of learned representations are unclear. To this end, we propose the AV-SUPERB benchmark that enables general-purpose evaluation of unimodal audio/visual a…

    Submitted 19 March, 2024; v1 submitted 19 September, 2023; originally announced September 2023.

    Comments: Accepted to ICASSP 2024; Evaluation Code: https://github.com/roger-tseng/av-superb Submission Platform: https://av.superbbenchmark.org

  24. arXiv:2308.04872  [pdf, other]

    cs.CV

    Tracking Players in a Badminton Court by Two Cameras

    Authors: Young-Ching Chou, Shen-Ru Zhang, Bo-Wei Chen, Hong-Qi Chen, Cheng-Kuan Lin, Yu-Chee Tseng

    Abstract: This study proposes a simple method for multi-object tracking (MOT) of players in a badminton court. We leverage two off-the-shelf cameras, one on the top of the court and the other on the side of the court. The one on the top is to track players' trajectories, while the one on the side is to analyze the pixel features of players. By computing the correlations between adjacent frames and engaging…

    Submitted 9 August, 2023; originally announced August 2023.

  25. arXiv:2307.10168  [pdf, other]

    cs.CL cs.HC

    LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs

    Authors: Tongshuang Wu, Haiyi Zhu, Maya Albayrak, Alexis Axon, Amanda Bertsch, Wenxing Deng, Ziqi Ding, Bill Guo, Sireesh Gururaja, Tzu-Sheng Kuo, Jenny T. Liang, Ryan Liu, Ihita Mandal, Jeremiah Milbauer, Xiaolin Ni, Namrata Padmanabhan, Subhashini Ramkumar, Alexis Sudjianto, Jordan Taylor, Ying-Jui Tseng, Patricia Vaidos, Zhijin Wu, Wei Wu, Chenyang Yang

    Abstract: LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities. However, current efforts focus mainly on simple atomic tasks. We explore whether LLMs can replicate more complex crowdsourcing pipelines. We find that modern LLMs can simulate some of crowdworkers' abilities in these "human computation algorithms," but…

    Submitted 19 July, 2023; v1 submitted 19 July, 2023; originally announced July 2023.

  26. arXiv:2307.09236  [pdf, ps, other]

    math.NA

    A discontinuity and cusp capturing PINN for Stokes interface problems with discontinuous viscosity and singular forces

    Authors: Yu-Hau Tseng, Ming-Chih Lai

    Abstract: In this paper, we present a discontinuity and cusp capturing physics-informed neural network (PINN) to solve Stokes equations with a piecewise-constant viscosity and singular force along an interface. We first reformulate the governing equations in each fluid domain separately and replace the singular force effect with the traction balance equation between solutions in two sides along the interfac…

    Submitted 10 September, 2023; v1 submitted 18 July, 2023; originally announced July 2023.

  27. arXiv:2306.00190  [pdf, other]

    cs.HC

    Contextualizing Problems to Student Interests at Scale in Intelligent Tutoring System Using Large Language Models

    Authors: Gautam Yadav, Ying-Jui Tseng, Xiaolin Ni

    Abstract: Contextualizing problems to align with student interests can significantly improve learning outcomes. However, this task often presents scalability challenges due to resource and time constraints. Recent advancements in Large Language Models (LLMs) like GPT-4 offer potential solutions to these issues. This study explores the ability of GPT-4 in the contextualization of problems within CTAT, an int…

    Submitted 31 May, 2023; originally announced June 2023.

  28. arXiv:2305.17855  [pdf, other]

    cs.CL

    Vec2Gloss: definition modeling leveraging contextualized vectors with Wordnet gloss

    Authors: Yu-Hsiang Tseng, Mao-Chang Ku, Wei-Ling Chen, Yu-Lin Chang, Shu-Kai Hsieh

    Abstract: Contextualized embeddings are proven to be powerful tools in multiple NLP tasks. Nonetheless, challenges regarding their interpretability and capability to represent lexical semantics still remain. In this paper, we propose that the task of definition modeling, which aims to generate the human-readable definition of the word, provides a route to evaluate or understand the high dimensional semantic…

    Submitted 28 May, 2023; originally announced May 2023.

  29. arXiv:2305.17663  [pdf, other]

    cs.CL

    Lexical Retrieval Hypothesis in Multimodal Context

    Authors: Po-Ya Angela Wang, Pin-Er Chen, Hsin-Yu Chou, Yu-Hsiang Tseng, Shu-Kai Hsieh

    Abstract: Multimodal corpora have become an essential language resource for language science and grounded natural language processing (NLP) systems due to the growing need to understand and interpret human communication across various channels. In this paper, we first present our efforts in building the first Multimodal Corpus for Languages in Taiwan (MultiMoco). Based on the corpus, we conduct a case study…

    Submitted 28 May, 2023; originally announced May 2023.

  30. arXiv:2305.14616  [pdf, other]

    cs.CL cs.CV

    Exploring Affordance and Situated Meaning in Image Captions: A Multimodal Analysis

    Authors: Pin-Er Chen, Po-Ya Angela Wang, Hsin-Yu Chou, Yu-Hsiang Tseng, Shu-Kai Hsieh

    Abstract: This paper explores the grounding issue regarding multimodal semantic representation from a computational cognitive-linguistic view. We annotate images from the Flickr30k dataset with five perceptual properties: Affordance, Perceptual Salience, Object Number, Gaze Cueing, and Ecological Niche Association (ENA), and examine their association with textual elements in the image captions. Our findings…

    Submitted 24 October, 2023; v1 submitted 23 May, 2023; originally announced May 2023.

    Comments: 10 pages, 9 figures

  31. arXiv:2305.01863  [pdf]

    cs.HC cs.AI cs.CL cs.SE

    GPTutor: a ChatGPT-powered programming tool for code explanation

    Authors: Eason Chen, Ray Huang, Han-Shin Chen, Yuen-Hsien Tseng, Liang-Yi Li

    Abstract: Learning new programming skills requires tailored guidance. With the emergence of advanced Natural Language Generation models like the ChatGPT API, there is now a possibility of creating a convenient and personalized tutoring system with AI for computer science education. This paper presents GPTutor, a ChatGPT-powered programming tool, which is a Visual Studio Code extension using the ChatGPT API…

    Submitted 15 June, 2023; v1 submitted 2 May, 2023; originally announced May 2023.

    Comments: 6 pages. International Conference on Artificial Intelligence in Education 2023

  32. arXiv:2303.09279  [pdf, other]

    cs.CR cs.MM

    Privacy-Preserving Video Conferencing via Thermal-Generative Images

    Authors: Sheng-Yang Chiu, Yu-Ting Huang, Chieh-Ting Lin, Yu-Chee Tseng, Jen-Jee Chen, Meng-Hsuan Tu, Bo-Chen Tung, YuJou Nieh

    Abstract: Due to the COVID-19 epidemic, video conferencing has evolved as a new paradigm of communication and teamwork. However, private and personal information can be easily leaked through cameras during video conferencing. This includes leakage of a person's appearance as well as the contents in the background. This paper proposes a novel way of using online low-resolution thermal images as conditions to…

    Submitted 28 March, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted for publication at IEEE International Conference on Robotics and Automation (ICRA) 2023

  33. arXiv:2303.08809  [pdf, other]

    cs.CL eess.AS

    Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences

    Authors: Yuan Tseng, Cheng-I Lai, Hung-yi Lee

    Abstract: Past work on unsupervised parsing is constrained to written form. In this paper, we present the first study on unsupervised spoken constituency parsing given unlabeled spoken sentences and unpaired textual data. The goal is to determine the spoken sentences' hierarchical syntactic structure in the form of constituency parse trees, such that each node is a span of audio that corresponds to a consti…

    Submitted 9 May, 2023; v1 submitted 15 March, 2023; originally announced March 2023.

    Comments: Accepted to ICASSP 2023; updated compute resource acknowledgements

  34. Self-supervised learning-based general laboratory progress pretrained model for cardiovascular event detection

    Authors: Li-Chin Chen, Kuo-Hsuan Hung, Yi-Ju Tseng, Hsin-Yao Wang, Tse-Min Lu, Wei-Chieh Huang, Yu Tsao

    Abstract: The inherent nature of patient data poses several challenges. Prevalent cases amass substantial longitudinal data owing to their patient volume and consistent follow-ups; however, longitudinal laboratory data are renowned for their irregularity, temporality, absenteeism, and sparsity. In contrast, recruitment for rare or specific cases is often constrained due to their limited patient size and epi…

    Submitted 7 September, 2023; v1 submitted 13 March, 2023; originally announced March 2023.

    Comments: published in IEEE Journal of Translational Engineering in Health & Medicine

    Journal ref: IEEE Journal of Translational Engineering in Health and Medicine, vol.12, p.43-56, 2023

  35. arXiv:2303.00439  [pdf, other]

    cond-mat.dis-nn cond-mat.stat-mech hep-lat

    Detection of Berezinskii--Kosterlitz--Thouless transitions for the two-dimensional $q$-state clock models with neural networks

    Authors: Yaun-Heng Tseng, Fu-Jiun Jiang

    Abstract: Using the technique of supervised neural networks (NN), we study the phase transitions of two-dimensional (2D) 6- and 8-state clock models on the square lattice. The employed NN has only one input layer, one hidden layer of 2 neurons, and one output layer. In addition, the NN is trained without any prior information about the considered models. Interestingly, despite its simple architecture, the b…

    Submitted 1 March, 2023; originally announced March 2023.

    Comments: 6 pages, 7 figures

  36. arXiv:2302.01457  [pdf]

    cond-mat.str-el cond-mat.mtrl-sci

    Spin waves in a ferromagnetic topological metal

    Authors: Wenliang Zhang, Teguh Citra Asmara, Yi Tseng, Junbo Li, Yimin Xiong, Vladimir N. Strocov, Y. Soh, Thorsten Schmitt, Gabriel Aeppli

    Abstract: In most metals, charges and spins can hop rapidly between atoms, yielding strong dispersion of their energy versus momentum. There are, however, special arrangements of atoms, such as twisted graphene bilayers or lattices which resemble woven bamboo "kagome" mats, so that particle motion with strong hopping between neighbours becomes nearly or even completely dispersionless. Such flat bands are in…

    Submitted 2 February, 2023; originally announced February 2023.

  37. arXiv:2301.01463  [pdf, ps, other]

    cond-mat.soft physics.bio-ph

    Mechanosensitive bonds induced complex cell motility patterns

    Authors: Jen-Yu Lo, Yuan-Heng Tseng, Hsuan-Yi Chen

    Abstract: The one-dimensional crawling movement of a cell is considered in this theoretical study. Our active gel model shows that for a cell with weakly mechanosensitive adhesion complexes, as myosin contractility increases, a cell starts to move at a constant velocity. As the mechanosensitivity of the adhesion complexes increases, a cell can exhibit stick-slip motion. Finally, a cell with highly mechanose…

    Submitted 4 January, 2023; originally announced January 2023.

  38. arXiv:2212.14655  [pdf, other]

    hep-lat cond-mat.dis-nn

    Machine learning phases of an Abelian gauge theory

    Authors: Jhao-Hong Peng, Yuan-Heng Tseng, Fu-Jiun Jiang

    Abstract: The phase transition of the two-dimensional $U(1)$ quantum link model on the triangular lattice is investigated by employing a supervised neural network (NN) consisting of only one input layer, one hidden layer of two neurons, and one output layer. No information on the studied model is used when the NN training is conducted. Instead, two artificially made configurations are considered as the trai…

    Submitted 30 December, 2022; originally announced December 2022.

    Comments: 7 pages, 12 figures

    Journal ref: Prog Theor Exp Phys (2023)

  39. arXiv:2212.10028  [pdf, other]

    gr-qc astro-ph.HE hep-ph hep-th

    A novel test of gravity via black hole eikonal correspondence

    Authors: Che-Yu Chen, Yu-Jui Chen, Meng-Yuan Ho, Yung-Hsuan Tseng

    Abstract: When adopted in black hole spacetimes, geometric-optics approximations imply a mapping between the quasinormal mode (QNM) spectrum of black holes in the eikonal limit and black hole images. In particular, the real part and the imaginary part of eikonal QNM frequencies are associated with the apparent size and the detailed structure of the ring images, respectively. This correspondence could be vio…

    Submitted 5 September, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: 9 pages, 2 figures. Matching published version

    Journal ref: Phys. Lett. B 845, 138153 (2023)

  40. arXiv:2211.06835  [pdf, other]

    cs.CV cs.AI

    Scale-Aware Crowd Counting Using a Joint Likelihood Density Map and Synthetic Fusion Pyramid Network

    Authors: Yi-Kuan Hsieh, Jun-Wei Hsieh, Yu-Chee Tseng, Ming-Ching Chang, Bor-Shiun Wang

    Abstract: We develop a Synthetic Fusion Pyramid Network (SPF-Net) with a scale-aware loss function design for accurate crowd counting. Existing crowd-counting methods assume that the training annotation points were accurate and thus ignore the fact that noisy annotations can lead to large model-learning bias and counting error, especially for counting highly dense crowds that appear far away. To the best of…

    Submitted 2 January, 2023; v1 submitted 13 November, 2022; originally announced November 2022.

    Comments: 8 pages, 8 figures, 4 tables

  41. arXiv:2211.06770  [pdf, other]

    cs.CV cs.LG eess.IV

    MicroISP: Processing 32MP Photos on Mobile Devices with Deep Learning

    Authors: Andrey Ignatov, Anastasia Sycheva, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc Van Gool

    Abstract: While neural networks-based photo processing solutions can provide a better image quality compared to the traditional ISP systems, their application to mobile devices is still very limited due to their very high computational complexity. In this paper, we present a novel MicroISP model designed specifically for edge devices, taking into account their computational and memory limitations. The propo…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: arXiv admin note: text overlap with arXiv:2211.06263

  42. arXiv:2211.06263  [pdf, other]

    cs.CV cs.LG eess.IV

    PyNet-V2 Mobile: Efficient On-Device Photo Processing With Neural Networks

    Authors: Andrey Ignatov, Grigory Malivenko, Radu Timofte, Yu Tseng, Yu-Syuan Xu, Po-Hsiang Yu, Cheng-Ming Chiang, Hsien-Kai Kuo, Min-Hung Chen, Chia-Ming Cheng, Luc Van Gool

    Abstract: The increased importance of mobile photography created a need for fast and performant RAW image processing pipelines capable of producing good visual results in spite of the mobile camera sensor limitations. While deep learning-based approaches can efficiently solve this problem, their computational requirements usually remain too large for high-resolution on-device image processing. To address th…

    Submitted 8 November, 2022; originally announced November 2022.

  43. Conversion of Legal Agreements into Smart Legal Contracts using NLP

    Authors: Eason Chen, Niall Roche, Yuen-Hsien Tseng, Walter Hernandez, Jiangbo Shangguan, Alastair Moore

    Abstract: A Smart Legal Contract (SLC) is a specialized digital agreement comprising natural language and computable components. The Accord Project provides an open-source SLC framework containing three main modules: Cicero, Concerto, and Ergo. Currently, we need lawyers, programmers, and clients to work together with great effort to create a usable SLC using the Accord Project. This paper proposes a pipeli…

    Submitted 5 April, 2023; v1 submitted 27 August, 2022; originally announced October 2022.

    Comments: 7 pages, Companion Proceedings of the ACM Web Conference 2023 (WWW '23 Companion), April 30-May 4, 2023, Austin, TX, USA

    MSC Class: 68T50 ACM Class: I.7

  44. A cusp-capturing PINN for elliptic interface problems

    Authors: Yu-Hau Tseng, Te-Sheng Lin, Wei-Fan Hu, Ming-Chih Lai

    Abstract: In this paper, we propose a cusp-capturing physics-informed neural network (PINN) to solve discontinuous-coefficient elliptic interface problems whose solution is continuous but has discontinuous first derivatives on the interface. To find such a solution using neural network representation, we introduce a cusp-enforced level set function as an additional feature input to the network to retain the…

    Submitted 16 April, 2023; v1 submitted 15 October, 2022; originally announced October 2022.

  45. arXiv:2210.07185  [pdf, other]

    cs.CL eess.AS

    On the Utility of Self-supervised Models for Prosody-related Tasks

    Authors: Guan-Ting Lin, Chi-Luen Feng, Wei-Ping Huang, Yuan Tseng, Tzu-Han Lin, Chen-An Li, Hung-yi Lee, Nigel G. Ward

    Abstract: Self-Supervised Learning (SSL) from speech data has produced models that have achieved remarkable performance in many tasks, and that are known to implicitly represent many aspects of information latently present in speech signals. However, relatively little is known about the suitability of such models for prosody-related tasks or the extent to which they encode prosodic information. We present a…

    Submitted 26 October, 2022; v1 submitted 13 October, 2022; originally announced October 2022.

    Comments: Accepted to IEEE SLT 2022

  46. An efficient neural-network and finite-difference hybrid method for elliptic interface problems with applications

    Authors: Wei-Fan Hu, Te-Sheng Lin, Yu-Hau Tseng, Ming-Chih Lai

    Abstract: A new and efficient neural-network and finite-difference hybrid method is developed for solving Poisson equation in a regular domain with jump discontinuities on embedded irregular interfaces. Since the solution has low regularity across the interface, when applying finite difference discretization to this problem, an additional treatment accounting for the jump discontinuities must be employed. H…

    Submitted 2 March, 2023; v1 submitted 11 October, 2022; originally announced October 2022.

    Journal ref: Commun. Comput. Phys., Vol. 33, pp.1090-1105 (2023)

  47. arXiv:2210.04400  [pdf]

    cs.HC cs.AI

    Focus Plus: Detect Learner's Distraction by Web Camera in Distance Teaching

    Authors: Eason Chen, Yuen Hsien Tseng, Kuo-Ping Lo

    Abstract: Distance teaching has become popular these years because of the COVID-19 epidemic. However, both students and teachers face several challenges in distance teaching, like being easy to distract. We proposed Focus+, a system designed to detect learners' status with the latest AI technology from their web camera to solve such challenges. By doing so, teachers can know students' status, and students c…

    Submitted 9 October, 2022; originally announced October 2022.

    Comments: 5 Pages, 4 Figures, 2021 National Chair Professorship Academic Series: Teaching and Learning in Pandemic Era

  48. arXiv:2209.13274  [pdf, other]

    cs.RO cs.CV

    Orbeez-SLAM: A Real-time Monocular Visual SLAM with ORB Features and NeRF-realized Mapping

    Authors: Chi-Ming Chung, Yang-Che Tseng, Ya-Ching Hsu, Xiang-Qian Shi, Yun-Hung Hua, Jia-Fong Yeh, Wen-Chin Chen, Yi-Ting Chen, Winston H. Hsu

    Abstract: A spatial AI that can perform complex tasks through visual signals and cooperate with humans is highly anticipated. To achieve this, we need a visual SLAM that easily adapts to new scenes without pre-training and generates dense maps for downstream tasks in real-time. None of the previous learning-based and non-learning-based visual SLAMs satisfy all needs due to the intrinsic limitations of their…

    Submitted 31 January, 2023; v1 submitted 27 September, 2022; originally announced September 2022.

  49. arXiv:2209.01891  [pdf, other]

    cs.NI

    A Survey on Open-Source-Defined Wireless Networks: Framework, Key Technology, and Implementation

    Authors: Liqiang Zhao, Muhammad Muhammad Bala, Wu Gang, Pan Chengkang, Yuan Yannan, Tian Zhigang, Yu-Chee Tseng, Chen Xiang, Bin Shen, Chih-Lin I

    Abstract: The realization of open-source-defined wireless networks in the telecommunication domain is accomplished through the fifth-generation network (5G). In contrast to its predecessors (3G and 4G), the 5G network can support a wide variety of heterogeneous use cases with challenging requirements from both the Internet and the Internet of Things (IoT). The future sixth-generation (6G) network will not o…

    Submitted 5 September, 2022; originally announced September 2022.

  50. arXiv:2207.11810  [pdf, other]

    cs.CV

    VizWiz-FewShot: Locating Objects in Images Taken by People With Visual Impairments

    Authors: Yu-Yun Tseng, Alexander Bell, Danna Gurari

    Abstract: We introduce a few-shot localization dataset originating from photographers who authentically were trying to learn about the visual content in the images they took. It includes nearly 10,000 segmentations of 100 categories in over 4,500 images that were taken by people with visual impairments. Compared to existing few-shot object detection and instance segmentation datasets, our dataset is the fir…

    Submitted 24 July, 2022; originally announced July 2022.

    Comments: Accepted to ECCV 2022. The first two authors contributed equally