Takuma Okamoto
Journal Articles
- 2024
- [j12]Haruki Yamashita, Takuma Okamoto, Ryoichi Takashima, Yamato Ohtani, Tetsuya Takiguchi, Tomoki Toda, Hisashi Kawai:
Fast Neural Speech Waveform Generative Models With Fully-Connected Layer-Based Upsampling. IEEE Access 12: 31409-31421 (2024)
- 2023
- [j11]Keisuke Matsubara, Takuma Okamoto, Ryoichi Takashima, Tetsuya Takiguchi, Tomoki Toda, Hisashi Kawai:
Harmonic-Net: Fundamental Frequency and Speech Rate Controllable Fast Neural Vocoder. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1902-1915 (2023)
- 2022
- [j10]Ryota Komatsu, Shengzhou Gao, Wenxin Hou, Mingxin Zhang, Tomohiro Tanaka, Keisuke Toyoda, Yusuke Kimura, Kent Hino, Yu Iwamoto, Kosuke Mori, Takuma Okamoto, Takahiro Shinozaki:
Automatic Spoken Language Acquisition Based on Observation and Dialogue. IEEE J. Sel. Top. Signal Process. 16(6): 1480-1492 (2022)
- [j9]Takuma Okamoto, Keisuke Matsubara, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Neural speech-rate conversion with multispeaker WaveNet vocoder. Speech Commun. 138: 1-12 (2022)
- 2021
- [j8]Keisuke Matsubara, Takuma Okamoto, Ryoichi Takashima, Tetsuya Takiguchi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Full-Band LPCNet: A Real-Time Neural Vocoder for 48 kHz Audio With a CPU. IEEE Access 9: 94923-94933 (2021)
- [j7]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN: A Non-Autoregressive Raw Waveform Generative Model With Pitch-Dependent Dilated Convolution Neural Network. IEEE ACM Trans. Audio Speech Lang. Process. 29: 792-806 (2021)
- 2017
- [j6]Takuma Okamoto:
Horizontal Local Sound Field Propagation Based on Sound Source Dimension Mismatch. J. Inf. Hiding Multim. Signal Process. 8(5): 1069-1081 (2017)
- [j5]Takuma Okamoto:
Localized Sound Zone Generation Based on External Radiation Canceller. J. Inf. Hiding Multim. Signal Process. 8(6): 1335-1351 (2017)
- [j4]Shigeru Toyama, Yasuhiro Tanaka, Satoshi Shirogane, Takashi Nakamura, Tokio Umino, Ryo Uehara, Takuma Okamoto, Hiroshi Igarashi:
Development of Wearable Sheet-Type Shear Force Sensor and Measurement System that is Insusceptible to Temperature and Pressure. Sensors 17(8): 1752 (2017)
- 2015
- [j3]Jorge Treviño, Yôiti Suzuki, Takuma Okamoto, Yukio Iwaya, Junfeng Li:
A Spatial Extrapolation Method to Derive High-Order Ambisonics Data from Stereo Sources. J. Inf. Hiding Multim. Signal Process. 6(6): 1100-1116 (2015)
- 2014
- [j2]Jorge Treviño, Takuma Okamoto, Yukio Iwaya, Yôiti Suzuki:
Sound Field Reproduction Using Ambisonics and Irregular Loudspeaker Arrays. IEICE Trans. Fundam. Electron. Commun. Comput. Sci. 97-A(9): 1832-1839 (2014)
- 2010
- [j1]Hoseok Wey, Akinori Ito, Takuma Okamoto, Yôiti Suzuki:
Multiple Description Coding Using Time Domain Division for MP3 coded Sound Signal. J. Inf. Hiding Multim. Signal Process. 1(4): 269-285 (2010)
Conference and Workshop Papers
- 2024
- [c30]Yamato Ohtani, Takuma Okamoto, Tomoki Toda, Hisashi Kawai:
FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder With Trainable Finite Impulse Response Filter. ICASSP 2024: 10871-10875
- [c29]Takuma Okamoto, Yamato Ohtani, Tomoki Toda, Hisashi Kawai:
Convnext-TTS And Convnext-VC: Convnext-Based Fast End-To-End Sequence-To-Sequence Text-To-Speech And Voice Conversion. ICASSP 2024: 12456-12460
- 2023
- [c28]Takuma Okamoto, Haruki Yamashita, Yamato Ohtani, Tomoki Toda, Hisashi Kawai:
WaveNeXt: ConvNeXt-Based Fast Neural Vocoder Without ISTFT layer. ASRU 2023: 1-8
- [c27]Ryota Komatsu, Yusuke Kimura, Takuma Okamoto, Takahiro Shinozaki:
Continuous Action Space-Based Spoken Language Acquisition Agent Using Residual Sentence Embedding and Transformer Decoder. ICASSP 2023: 1-5
- [c26]Takuma Okamoto, Tomoki Toda, Hisashi Kawai:
E2E-S2S-VC: End-To-End Sequence-To-Sequence Voice Conversion. INTERSPEECH 2023: 2043-2047
- 2021
- [c25]Takuma Okamoto, Tomoki Toda, Hisashi Kawai:
Multi-Stream HiFi-GAN with Data-Driven Waveform Decomposition. ASRU 2021: 610-617
- [c24]Takuma Okamoto:
Close-Talking Recording with Planarly Distributed Microphones. ICASSP 2021: 4470-4474
- [c23]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Noise Level Limited Sub-Modeling for Diffusion Probabilistic Vocoders. ICASSP 2021: 6029-6033
- [c22]Keisuke Matsubara, Takuma Okamoto, Ryoichi Takashima, Tetsuya Takiguchi, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
High-Intelligibility Speech Synthesis for Dysarthric Speakers with LPCNet-Based TTS and CycleVAE-Based VC. ICASSP 2021: 7058-7062
- [c21]Takuma Okamoto:
2D Multizone Sound Field Synthesis with Interior-Exterior Ambisonics. WASPAA 2021: 276-280
- 2020
- [c20]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Transformer-Based Text-to-Speech with Weighted Forced Attention. ICASSP 2020: 6729-6733
- [c19]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-Autoregressive Pitch-Dependent Dilated Convolution Model for Parametric Speech Generation. INTERSPEECH 2020: 3535-3539
- 2019
- [c18]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Tacotron-Based Acoustic Model Using Phoneme Alignment for Practical Neural Text-to-Speech Systems. ASRU 2019: 214-221
- [c17]Takuma Okamoto:
Horizontal 3D Sound Field Recording and 2.5D Synthesis with Omni-directional Circular Arrays. ICASSP 2019: 960-964
- [c16]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Investigations of Real-time Gaussian Fftnet and Parallel Wavenet Neural Vocoders with Simple Acoustic Features. ICASSP 2019: 7020-7024
- [c15]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Real-Time Neural Text-to-Speech with Sequence-to-Sequence Acoustic Model and WaveGlow or Single Gaussian WaveRNN Vocoders. INTERSPEECH 2019: 1308-1312
- [c14]Takuma Okamoto:
3D Localized Sound Zone Generation with a Planar Omni-Directional Loudspeaker Array. WASPAA 2019: 110-114
- 2018
- [c13]Takuma Okamoto, Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
An Investigation of Subband Wavenet Vocoder Covering Entire Audible Frequency Range with Limited Acoustic Features. ICASSP 2018: 5654-5658
- [c12]Takuma Okamoto:
2.5D Localized Sound Zone Generation with a Circular Array of Fixed-Directivity Loudspeakers. IWAENC 2018: 321-325
- [c11]Takuma Okamoto, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Improving FFTNet Vocoder with Noise Shaping and Subband Approaches. SLT 2018: 304-311
- 2017
- [c10]Takuma Okamoto, Kentaro Tachibana, Tomoki Toda, Yoshinori Shiga, Hisashi Kawai:
Subband wavenet with overlapped single-sideband filterbanks. ASRU 2017: 698-704
- [c9]Takuma Okamoto:
Analytical approach to 2.5D sound field control using a circular double-layer array of fixed-directivity loudspeakers. ICASSP 2017: 91-95
- [c8]Takuma Okamoto:
Angular spectrum decomposition-based 2.5D higher-order spherical harmonic sound field synthesis with a linear loudspeaker array. WASPAA 2017: 180-184
- 2016
- [c7]Takuma Okamoto:
2.5D higher order ambisonics for a sound field described by angular spectrum coefficients. ICASSP 2016: 326-330
- 2015
- [c6]Takuma Okamoto:
Near-field sound propagation based on a circular and linear array combination. ICASSP 2015: 624-628
- [c5]Takuma Okamoto:
Analytical methods of generating multiple sound zones for open and baffled circular loudspeaker arrays. WASPAA 2015: 1-5
- 2014
- [c4]Takuma Okamoto:
Generation of multiple sound zones by spatial filtering in wavenumber domain using a linear array of loudspeakers. ICASSP 2014: 4733-4737
- 2013
- [c3]Akitoshi Kawamura, Takuma Okamoto, Yuichi Tatsu, Yushi Uno, Masahide Yamato:
Morpion Solitaire 5D: a new upper bound 121 on the maximum score. CCCG 2013
- [c2]Jorge Treviño, Takuma Okamoto, Yukio Iwaya, Junfeng Li, Yôiti Suzuki:
Extrapolation of Horizontal Ambisonics Data from Mainstream Stereo Sources. IIH-MSP 2013: 302-305
- 2010
- [c1]Toshiyuki Kimura, Yoko Yamakata, Michiaki Katsumoto, Takuma Okamoto, Satoshi Yairi, Yukio Iwaya, Yôiti Suzuki:
Comparative performance evaluation of near 3D sound field reproduction system with directional loudspeakers and wave field synthesis. IUCS 2010: 221-228
Parts in Books or Collections
- 2020
- [p1]Yoshinori Shiga, Jinfu Ni, Kentaro Tachibana, Takuma Okamoto:
Text-to-Speech Synthesis. Speech-to-Speech Translation 2020: 39-52
Informal and Other Publications
- 2022
- [i4]Detai Xin, Shinnosuke Takamichi, Takuma Okamoto, Hisashi Kawai, Hiroshi Saruwatari:
Speaking-Rate-Controllable HiFi-GAN Using Feature Interpolation. CoRR abs/2204.10561 (2022)
- 2020
- [i3]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN Vocoder: A Non-autoregressive Pitch-dependent Dilated Convolution Model for Parametric Speech Generation. CoRR abs/2005.08654 (2020)
- [i2]Yi-Chiao Wu, Tomoki Hayashi, Takuma Okamoto, Hisashi Kawai, Tomoki Toda:
Quasi-Periodic Parallel WaveGAN: A Non-autoregressive Raw Waveform Generative Model with Pitch-dependent Dilated Convolution Neural Network. CoRR abs/2007.12955 (2020)
- 2013
- [i1]Akitoshi Kawamura, Takuma Okamoto, Yuichi Tatsu, Yushi Uno, Masahide Yamato:
Morpion Solitaire 5D: a new upper bound of 121 on the maximum score. CoRR abs/1307.8192 (2013)
last updated on 2024-10-04 21:00 CEST by the dblp team
all metadata released as open data under CC0 1.0 license