


26th ISM 2024: Tokyo, Japan
- IEEE International Symposium on Multimedia, ISM 2024, Tokyo, Japan, December 11-13, 2024. IEEE 2024, ISBN 979-8-3315-1111-1

- Subhadra Gopalakrishnan, Trisha Mittal, Jaclyn Pytlarz, Yuheng Zhao:

S2MGen: A synthetic skin mask generator for improving segmentation. 1-8 - Yi-Chieh Wu, Yu-Jung Hsu:

Generating and Evaluating Cursive Chinese Calligraphy by Semi-Classifying Style: A Case Study Using a Diffusion Model. 9-16 - Yassine Belkhouche, AlaaIdin Dwaik:

StegoFusion-Net: Fusion of Convolutional Neural Networks for Spatial Image Steganalysis. 17-23 - Hisayoshi Kaneda, Ryota Kawamata, Kazuyoshi Yamazaki, Kazuya Shimizu:

Disparity Correction Method of the Monocular Omnidirectional Stereo Camera. 24-25 - Wen-Hung Liao, Po-Han Chen, Yi-Chieh Wu:

Unveiling the Potential of SSL-Generated Audio Embeddings for Cross-Lingual Speaker Recognition. 26-32 - Di Hu, Katunobu Ito:

Two-stage instrument timbre transfer method using RAVE. 33-40 - Aoi Ito, Katunobu Itou:

Speaker Pseudonymization for Japanese Speech Using Duration Embeddings. 41-48 - Duc V. Nguyen, Quang Long Nguyen, Tran Thuy Hien, Nguyen Ngoc Huyen, Truong Thu Huong, Pham Ngoc Nam:

Modeling User Quality of Experience in Adaptive Point Cloud Video Streaming. 49-54 - Steve Göring, Rasmus Merten, Alexander Raake:

Appeal prediction for AI up-scaled Images. 55-62 - Tailai Song, Paolo Garza, Michela Meo, Maurizio Matteo Munafò:

Modelling Concurrent RTP Flows for End-to-end Predictions of QoS in Real Time Communications. 63-70 - Sushant Gautam, Mehdi Houshmand Sarkhoosh, Jan Held, Cise Midoglu, Anthony Cioppa, Silvio Giancola, Vajira Thambawita, Michael A. Riegler, Pål Halvorsen, Mubarak Shah:

SoccerNet-Echoes: A Soccer Game Audio Commentary Dataset. 71-78 - Peter O. Fasogbon:

Ensuring Color Consistency in RGB-D Multi-Camera Setup. 79-84 - Ahmadreza Sezavar, Catarina Brites, João Ascenso:

Low Complexity Learning-based Lossless Event-based Compression. 85-92 - Håkon Maric Solberg, Mehdi Houshmand Sarkhoosh, Sushant Gautam, Saeed Shafiee Sabet, Pål Halvorsen, Cise Midoglu:

PlayerTV: Advanced Player Tracking and Identification for Automatic Soccer Highlight Clips. 93-97 - Wei Zhang, Victor Soares Bursztyn:

Flexible And Faithful Data Insights Generation. 98-105 - Shuntaro Masuda, Toshihiko Yamasaki:

Holistic Visualization of Contextual Knowledge in Hotel Customer Reviews Using Self-Attention. 106-109 - Wen-Hung Liao, Yang-Jing Lin:

Investigation of Feature Distribution and Network Weight Updates in the Machine Unlearning Process. 110-113 - Greeshma Sree Parimi, Gurkirat Singh Guliani, Min Chen:

Platform for Endangered Language Education. 114-115 - Clément Saint-Marc, Katunobu Itou:

Homophonic Music Composition Using a GAN and LSTM Pipeline for Melody and Harmony Generation. 116-119 - Yuhuan Wang, Katunobu Itou:

Instrumentality Classification Evaluation System for Natural Sounds. 120-123 - Tomoo Kouzai, Junya Koguchi, Tetsuro Kitahara:

Generating Bass Phrases from Guitar Chord Backing with NMF. 124-125 - Jakub Kovác, Wolfgang Hürst:

Watch your back! Dynamic thumbnails for a 360-degree video player to enhance viewing experience on 2D displays. 126-132 - Daichi Arai, Yuichi Kondo, Yasuko Sugito, Yuichi Kusakabe:

Influence of Display Devices and Field of View on Subjective Quality of Experience Evaluation of 8K 360° Videos. 133-136 - Serkan Sulun, Paula Viana, Matthew E. P. Davies:

VEMOCLAP: A video emotion classification web application. 137-140 - Zhikai Liu, Kun Zhang, Xin-Yi Cui, Wei Sun, Fan Liang:

A Power-Law Transformation Approach for Template-Based Cross-Component Prediction. 141-142 - Dominik Keller, Paul Rudi Frank, Steve Göring, Alexander Raake:

Investigating the Impact of High Frame Rate on Video Quality: A SAMVIQ Approach. 143-144 - Tran Gia Minh, Truong Thu Huong, Duc V. Nguyen:

A Server-driven View-aware Point Cloud Video Streaming Framework. 145-148 - Pedro Martin, António Rodrigues, João Ascenso, Maria Paula Queluz:

Evaluation of strategies for efficient rate-distortion NeRF streaming. 149-153 - Yumeka Chujo, Yusuke Tagashira, Yukiko Harada, Kenji Kanai, Jiro Katto:

Perceptual Quality Driven Point Cloud Compression for 6DoF 3D Point Cloud Streaming. 154-157 - Yuriy A. Reznik, Guillem Cabrera:

On Multi-CDN Delivery Costs Optimization Problem. 158-161 - Geerthan Srikantharajah, Naimul Khan:

Sliding Window Check: Repairing Object Identities. 162-169 - Genta Matsukawa, Atsuo Yoshitaka:

Data Augmentation with Diffusion Model for Hand Detection. 170-173 - Keita Yamane, Akira Kitayama, Keigo Hasegawa, Yusuke Obonai, Hiroto Sasao:

AI Maintenance Techniques by Detecting Performance Degradation in Domain Shift Using Model Ensembles. 174-175 - Raphael Waltenspül, Florian Spiess, Heiko Schuldt:

Cross-Modal 3D Model Retrieval. 176-180 - Takumi Komori, Takahiro Hayashi:

Prevention of Unexpected Object Generation in Diffusion Model-Based Inpainting. 181-184 - Maria Tzelepi, Vasileios Mezaris:

LMM-Regularized CLIP Embeddings for Image Classification. 185-188 - Kolja Kieslich, Louay Bassbouss, Stephan Steglich, Stefan Arbanowski:

Evaluation Framework for Novel View Synthesis. 189-192 - Jussif J. Abularach Arnez, Cassio A. Tavares Alves, Wederson Medeiros Silva, Isaac Barros Gomes, Carla Lapa Nogueira, Maria G. Lima Damasceno:

A Simulation for the Evaluation of the Mean Opinion Score (MOS) for EVS-WB and AMR-WB Audio Codecs for 5G Mobile Networks. 193-196 - John Li, Deepak Nair, Klara Nahrstedt, Indranil Gupta, Shehab Sarar Ahmed:

FrameCorr: Adaptive, Autoencoder-based Neural Compression for Video Reconstruction in Resource and Timing Constrained Network Settings. 197-200 - Yasuhiro Mochida, Takuro Yamaguchi, Hirokazu Takahashi, Koichi Takasugi:

Ultra-low-latency 8K120p-video-transmission System Parallelizing SMPTE ST 2110. 201-202 - Takuro Yamaguchi, Yasuhiro Mochida, Hirokazu Takahashi:

Low-latency Software-based Uncompressed Video Transmission. 203-204 - Pengcheng Zeng, Atsuo Yoshitaka:

Visual Speech Recognition with Surrounding and Emotional Information. 205-212 - John O. Murray, Michael Zink:

Synchronized Object Sharing for Augmented Reality Virtual Conferencing. 213-218 - Viviana Crescitelli, Takashi Oshima:

Fusion-Based Human Pose Estimation Using RGB and IR Images with Transformer-Based Decoding. 219-220 - Kin Ching Lydia Chau, Zhi Yu, Ruowei Jiang:

Occlusion-Aware Real-Time Tiny Facial Alignment Model for Makeup Virtual Try-On. 221-224 - Nan Bu, Kakeru Nakano:

A Study on Mental Stress Test using Cybersickness caused by Virtual Reality Contents. 225-226 - Jana Motowilowa, Maurizio Vergari, Tanja Kojic, Maximilian Warsinke, Sebastian Möller, Jan-Niklas Voigt-Antons:

Exploring Augmented Table Setup and Lighting Customization in a Simulated Restaurant to Improve the User Experience. 227-231 - Pedro Baptista de Castro, Hiroko Sukeda, Soichi Takashige:

Human-in-the-loop knowledge base upkeep for retrieval augmented generation applications. 232-233 - Hannes Fassold:

LiveSkeleton: High-Quality Real-Time Human Tracking and Pose Estimation. 234-235 - Florian Schimanke, Robert Mertens, Felix Prankel:

A technical Concept for enhancing the Student Experience in Hybrid Lecture Scenarios. 236-241 - Ryota Kishimoto, Shuhei Tsuchida, Tsutomu Terada, Masahiko Tsukamoto:

SpotiView: Partial Face Display Method for Smooth Communication While Protecting Privacy. 242-249 - Rajini Chittimalla, Sujung Choi, Madhu Sai Vineel Reka, Yassine Belkhouche:

Characterizing students behavior in multi-user multi-computer testing environments. 250-254 - Alexander Gantikow, Andreas Isking, Wolfgang Müller, Paul Libbrecht, Sandra Rebholz:

Evaluating Interactive Concept Maps Produced from E-Portfolios. 255-260 - Gabriel Valerio-Ureña, Giomara Sevilla-Campoverde, Soledad Ortúzar, Christian Lazcano:

Gender Stereotypes in the Creation of Educational Cases with ChatGPT. 261-266 - Karam Dawoud, Birgit Nierula, Farelle Toumaleu Siewe, Thomas Koch, Daniel Johannes Meyer, Andreas Bock, Marianne Heinze, Daniela Knuth, Denis Martin, Julia Schander, Anna Hilsmann, Peter Eisert, Sebastian Bosse:

Multi-View Gesture Recognition in Conflict Situations. 267-268 - Mario Wolf, Sebastian Hartwig, Gregor Steinhöfel, Heinrich Söbke, Eckhard Kraft:

PanoramaViewer - A Framework for Educational Collaborative Virtual Field Trips. 269-274 - Yusuke Maeda, Takahiro Hayashi:

Real-time Multi-modal Highlight Prediction for Simultaneous Viewing of Multiple Live Streams. 275-278 - Itsuki Sano, Yuanyuan Wang, Yukiko Kawai, Kazutoshi Sumiya:

Slide Analysis Method for Editing Lecture Materials based on Hierarchical Structures of Subject Terminologies. 279-284 - Boris Ruf, Marcin Detyniecki:

The ≪Huh?≫ Button: Improving Understanding in Educational Videos with Large Language Models. 285-289
