Shalabh Bhatnagar
2020 – today
- 2024
- [j83]Akash Mondal, Prashanth L. A., Shalabh Bhatnagar:
Truncated Cauchy random perturbations for smoothed functional-based stochastic optimization. Autom. 162: 111528 (2024) - [j82]Arghyadeep Barat, Prabuchandran K. J., Shalabh Bhatnagar:
Energy Management in a Cooperative Energy Harvesting Wireless Sensor Network. IEEE Commun. Lett. 28(1): 243-247 (2024) - [j81]Lakshmi Mandal, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Variance-Reduced Deep Actor-Critic With an Optimally Subsampled Actor Recursion. IEEE Trans. Artif. Intell. 5(7): 3607-3623 (2024) - [c85]Mizhaan Prajit Maniyar, Prashanth L. A., Akash Mondal, Shalabh Bhatnagar:
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning. AISTATS 2024: 4708-4716 - [c84]V. P. Vivek, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Dynamic Energy Management in Competing Microgrids using Reinforcement Learning. ISGT 2024: 1-5 - [i85]Prashansa Panda, Shalabh Bhatnagar:
Critic-Actor for Average Reward MDPs with Function Approximation: A Finite-Time Analysis. CoRR abs/2402.01371 (2024) - [i84]Joji Joseph, Bharadwaj Amrutur, Shalabh Bhatnagar:
Gradient-Driven 3D Segmentation and Affordance Transfer in Gaussian Splatting Using 2D Masks. CoRR abs/2409.11681 (2024) - 2023
- [j80]Shalabh Bhatnagar, Vivek S. Borkar, Soumyajit Guin:
Actor-Critic or Critic-Actor? A Tale of Two Time Scales. IEEE Control. Syst. Lett. 7: 2671-2676 (2023) - [c83]Soumyajit Guin, Shalabh Bhatnagar:
A Policy Gradient Approach for Finite Horizon Constrained Markov Decision Processes. CDC 2023: 3353-3359 - [c82]Shalabh Bhatnagar, Prashanth L. A.:
Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias. CISS 2023: 1-6 - [c81]Naman Saxena, Subhojyoti Khastagir, Shishir Kolathaya, Shalabh Bhatnagar:
Off-Policy Average Reward Actor-Critic with Deterministic Policy Search. ICML 2023: 30130-30203 - [c80]Sambhu H. Karumanchi, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
Autonomous UAV Navigation in Complex Environments using Human Feedback. RO-MAN 2023: 499-506 - [i83]Lakshmi Mandal, Shalabh Bhatnagar:
n-Step Temporal Difference Learning with Optimal n. CoRR abs/2303.07068 (2023) - [i82]Mizhaan Prajit Maniyar, Akash Mondal, Prashanth L. A., Shalabh Bhatnagar:
A Cubic-regularized Policy Newton Algorithm for Reinforcement Learning. CoRR abs/2304.10951 (2023) - [i81]Arunselvan Ramaswamy, Shalabh Bhatnagar, Naman Saxena:
A Framework for Provably Stable and Consistent Training of Deep Feedforward Networks. CoRR abs/2305.12125 (2023) - [i80]Naman Saxena, Subhojyoti Khastagir, Shishir Kolathaya, Shalabh Bhatnagar:
Off-Policy Average Reward Actor-Critic with Deterministic Policy Search. CoRR abs/2305.12239 (2023) - [i79]Shalabh Bhatnagar:
The Reinforce Policy Gradient Algorithm Revisited. CoRR abs/2310.05000 (2023) - [i78]Arghyadeep Barat, Prabuchandran K. J., Shalabh Bhatnagar:
Energy Management in a Cooperative Energy Harvesting Wireless Sensor Network. CoRR abs/2310.05911 (2023) - [i77]Prashansa Panda, Shalabh Bhatnagar:
Finite Time Analysis of Constrained Actor Critic and Constrained Natural Actor Critic Algorithms. CoRR abs/2310.16363 (2023) - [i76]Lakshmi Mandal, Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
Approximate Linear Programming and Decentralized Policy Improvement in Cooperative Multi-agent Markov Decision Processes. CoRR abs/2311.11789 (2023) - 2022
- [j79]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Analyzing Approximate Value Iteration Algorithms. Math. Oper. Res. 47(3): 2138-2159 (2022) - [j78]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Generalized Second-Order Value Iteration in Markov Decision Processes. IEEE Trans. Autom. Control. 67(8): 4241-4247 (2022) - [j77]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
A Generalized Minimax Q-Learning Algorithm for Two-Player Zero-Sum Stochastic Games. IEEE Trans. Autom. Control. 67(9): 4816-4823 (2022) - [c79]Rohan Deb, Shalabh Bhatnagar:
Gradient Temporal Difference with Momentum: Stability and Convergence. AAAI 2022: 6488-6496 - [c78]Rohan Deb, Meet Gandhi, Shalabh Bhatnagar:
Schedule Based Temporal Difference Algorithms. Allerton 2022: 1-6 - [c77]Priya Shanmugasundaram, Shalabh Bhatnagar:
Co-operative Multi-agent Twin Delayed DDPG for Robust Phase Duration Optimization of Large Road Networks. ICAART (Revised Selected Paper) 2022: 122-142 - [c76]Priya Shanmugasundaram, Shalabh Bhatnagar:
Robust Traffic Signal Timing Control using Multiagent Twin Delayed Deep Deterministic Policy Gradients. ICAART (2) 2022: 477-485 - [c75]Utkarsh A. Mishra, Soumya R. Samineni, Prakhar Goel, Chandravaran Kunjeti, Himanshu Lodha, Aman Singh, Aditya Sagi, Shalabh Bhatnagar, Shishir Kolathaya:
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning. ICRA 2022: 1631-1637 - [c74]Raghuram Bharadwaj Diddigi, Prateek Jain, Prabuchandran K. J., Shalabh Bhatnagar:
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm. IJCNN 2022: 1-10 - [c73]Ashish Kumar Jayant, Shalabh Bhatnagar:
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm. NeurIPS 2022 - [c72]Sindhu Padakandla, Prabuchandran K. J., Sourav Ganguly, Shalabh Bhatnagar:
Data Efficient Safe Reinforcement Learning. SMC 2022: 1167-1172 - [i75]Arun Raman, Keerthan Shagrithaya, Shalabh Bhatnagar:
Reinforcement Learning for Task Specifications with Action-Constraints. CoRR abs/2201.00286 (2022) - [i74]Akash Mondal, Prashanth L. A., Shalabh Bhatnagar:
A Gradient Smoothed Functional Algorithm with Truncated Cauchy Random Perturbations for Stochastic Optimization. CoRR abs/2208.00290 (2022) - [i73]Shalabh Bhatnagar, Vivek S. Borkar, Soumyajit Guin:
Actor-Critic or Critic-Actor? A Tale of Two Time Scales. CoRR abs/2210.04470 (2022) - [i72]Soumyajit Guin, Shalabh Bhatnagar:
A policy gradient approach for Finite Horizon Constrained Markov Decision Processes. CoRR abs/2210.04527 (2022) - [i71]Ashish Kumar Jayant, Shalabh Bhatnagar:
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm. CoRR abs/2210.07573 (2022) - [i70]Shalabh Bhatnagar, Prashanth L. A.:
Generalized Simultaneous Perturbation Stochastic Approximation with Reduced Estimator Bias. CoRR abs/2212.10477 (2022) - 2021
- [j76]Prabuchandran K. J., Santosh Penubothula, Chandramouli Kamanchi, Shalabh Bhatnagar:
Novel First Order Bayesian Optimization with an Application to Reinforcement Learning. Appl. Intell. 51(3): 1565-1579 (2021) - [j75]Prasenjit Karmakar, Shalabh Bhatnagar:
On tight bounds for function approximation error in risk-sensitive reinforcement learning. Syst. Control. Lett. 150: 104899 (2021) - [j74]Arunselvan Ramaswamy, Shalabh Bhatnagar, Daniel E. Quevedo:
Asynchronous Stochastic Approximations With Asymptotically Biased Errors and Deep Multiagent Learning. IEEE Trans. Autom. Control. 66(9): 3969-3983 (2021) - [j73]Prasenjit Karmakar, Shalabh Bhatnagar:
Stochastic Approximation With Iterate-Dependent Markov Noise Under Verifiable Conditions in Compact State Space With the Stability of Iterates Not Ensured. IEEE Trans. Autom. Control. 66(12): 5941-5954 (2021) - [j72]Abhik Singla, Sindhu Padakandla, Shalabh Bhatnagar:
Memory-Based Deep Reinforcement Learning for Obstacle Avoidance in UAV With Limited Environment Knowledge. IEEE Trans. Intell. Transp. Syst. 22(1): 107-118 (2021) - [c71]P. Parnika, Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Attention Actor-Critic Algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning. AAMAS 2021: 1616-1618 - [i69]P. Parnika, Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Attention Actor-Critic algorithm for Multi-Agent Constrained Co-operative Reinforcement Learning. CoRR abs/2101.02349 (2021) - [i68]Raghuram Bharadwaj Diddigi, Prateek Jain, Prabuchandran K. J., Shalabh Bhatnagar:
Neural Network Compatible Off-Policy Natural Actor-Critic Algorithm. CoRR abs/2110.10017 (2021) - [i67]Vivek VP, Shalabh Bhatnagar:
Finite Horizon Q-learning: Stability, Convergence and Simulations. CoRR abs/2110.15093 (2021) - [i66]Rohan Deb, Shalabh Bhatnagar:
Gradient Temporal Difference with Momentum: Stability and Convergence. CoRR abs/2111.11004 (2021) - [i65]Rohan Deb, Meet Gandhi, Shalabh Bhatnagar:
Schedule Based Temporal Difference Algorithms. CoRR abs/2111.11768 (2021) - [i64]Utkarsh A. Mishra, Soumya R. Samineni, Prakhar Goel, Chandravaran Kunjeti, Himanshu Lodha, Aman Singh, Aditya Sagi, Shalabh Bhatnagar, Shishir Kolathaya:
Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning. CoRR abs/2112.02999 (2021) - [i63]Rohan Deb, Shalabh Bhatnagar:
N-Timescale Stochastic Approximation: Stability and Convergence. CoRR abs/2112.03515 (2021) - 2020
- [j71]Sindhu Padakandla, Prabuchandran K. J., Shalabh Bhatnagar:
Reinforcement learning algorithm for non-stationary environments. Appl. Intell. 50(11): 3590-3606 (2020) - [j70]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Successive Over-Relaxation Q-Learning. IEEE Control. Syst. Lett. 4(1): 55-60 (2020) - [j69]Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar:
Generalized Speedy Q-Learning. IEEE Control. Syst. Lett. 4(3): 524-529 (2020) - [j68]Vinayaka G. Yaji, Shalabh Bhatnagar:
Stochastic Recursive Inclusions in Two Timescales with Nonadditive Iterate-Dependent Markov Noise. Math. Oper. Res. 45(4): 1405-1444 (2020) - [j67]Vinayaka G. Yaji, Shalabh Bhatnagar:
Analysis of Stochastic Approximation Schemes With Set-Valued Maps in the Absence of a Stability Guarantee and Their Stabilization. IEEE Trans. Autom. Control. 65(3): 1100-1115 (2020) - [j66]Prashanth L. A., Shalabh Bhatnagar, Nirav Bhavsar, Michael C. Fu, Steven I. Marcus:
Random Directions Stochastic Approximation With Deterministic Perturbations. IEEE Trans. Autom. Control. 65(6): 2450-2465 (2020) - [c70]Akshay Dharmavaram, Matthew Riemer, Shalabh Bhatnagar:
Hierarchical Average Reward Policy Gradient Algorithms (Student Abstract). AAAI 2020: 13777-13778 - [c69]Kartik Paigwar, Lokesh Krishna, Sashank Tirumala, Naman Khetan, Aditya Varma, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach. CoRL 2020: 2257-2267 - [c68]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
A Convergent Off-Policy Temporal Difference Algorithm. ECAI 2020: 1103-1110 - [c67]Indu John, Shalabh Bhatnagar:
Deep Reinforcement Learning with Successive Over-Relaxation and its Application in Autoscaling Cloud Resources. IJCNN 2020: 1-6 - [c66]Shravan Nayak, Chanakya Ajit Ekbote, Annanya Pratap Singh Chauhan, Raghuram Bharadwaj Diddigi, Prishita Ray, Abhinava Sikdar, Sai Koti Reddy Danda, Shalabh Bhatnagar:
Stochastic Game Frameworks for Efficient Energy Management in Microgrid Networks. ISGT-Europe 2020: 116-120 - [c65]Sindhu Padakandla, Shilpa Rao, Shalabh Bhatnagar:
Learning-Based Resource Allocation in Industrial IoT Systems. PIMRC 2020: 1-7 - [c64]Sashank Tirumala, Sagar Venkatesh Gubbi, Kartik Paigwar, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations. RO-MAN 2020: 1107-1112 - [i62]Shravan Nayak, Chanakya Ajit Ekbote, Annanya Pratap Singh Chauhan, Raghuram Bharadwaj Diddigi, Prishita Ray, Abhinava Sikdar, Sai Koti Reddy Danda, Shalabh Bhatnagar:
A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks. CoRR abs/2002.02084 (2020) - [i61]Sashank Tirumala, Sagar Venkatesh Gubbi, Kartik Paigwar, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Learning Stable Manoeuvres in Quadruped Robots from Expert Demonstrations. CoRR abs/2007.14290 (2020) - [i60]Meet Gandhi, Atreyee Kundu, Shalabh Bhatnagar:
A reinforcement learning approach to hybrid control design. CoRR abs/2009.00821 (2020) - [i59]Dhuruva Priyan G. M, Abhik Singla, Shalabh Bhatnagar:
Hindsight Experience Replay with Kronecker Product Approximate Curvature. CoRR abs/2010.06142 (2020) - [i58]Kartik Paigwar, Lokesh Krishna, Sashank Tirumala, Naman Khetan, Aditya Sagi, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Robust Quadrupedal Locomotion on Sloped Terrains: A Linear Policy Approach. CoRR abs/2010.16342 (2020)
2010 – 2019
- 2019
- [j65]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
An Online Sample-Based Method for Mode Estimation Using ODE Analysis of Stochastic Approximation Algorithms. IEEE Control. Syst. Lett. 3(3): 697-702 (2019) - [j64]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Stability of Stochastic Approximations With "Controlled Markov" Noise and Temporal Difference Learning. IEEE Trans. Autom. Control. 64(6): 2614-2620 (2019) - [c63]Ajin George Joseph, Shalabh Bhatnagar:
Stochastic Approximation Trackers for Model-Based Search. Allerton 2019: 741-748 - [c62]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Prabuchandran K. J., Shalabh Bhatnagar:
Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning. AAMAS 2019: 1931-1933 - [c61]Ajin George Joseph, Shalabh Bhatnagar:
An Adaptive and Incremental Approach to Quantile Estimation. CDC 2019: 6025-6031 - [c60]Indu John, Shalabh Bhatnagar:
Efficient Budget Allocation and Task Assignment in Crowdsourcing. COMAD/CODS 2019: 318-321 - [c59]Indu John, Ravikumar Karumanchi, Shalabh Bhatnagar:
Predictive and Prescriptive Analytics for Performance Optimization: Framework and a Case Study on a Large-Scale Enterprise System. ICMLA 2019: 876-881 - [c58]Abhik Singla, Shounak Bhattacharya, Dhaivat Dholakiya, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives. ICRA 2019: 7434-7440 - [c57]Shounak Bhattacharya, Abhik Singla, Abhimanyu, Dhaivat Dholakiya, Shalabh Bhatnagar, Bharadwaj Amrutur, Ashitava Ghosal, Shishir Kolathaya:
Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots. RO-MAN 2019: 1-6 - [c56]Shishir Kolathaya, Ashitava Ghosal, Bharadwaj Amrutur, Ashish Joglekar, Suhan Shetty, Dhaivat Dholakiya, Abhimanyu, Aditya Sagi, Shounak Bhattacharya, Abhik Singla, Shalabh Bhatnagar:
Trajectory based Deep Policy Search for Quadrupedal Walking. RO-MAN 2019: 1-6 - [c55]Indu John, Aiswarya Sreekantan, Shalabh Bhatnagar:
Efficient Adaptive Resource Provisioning for Cloud Applications using Reinforcement Learning. FAS*W@SASO/ICAC 2019: 271-272 - [i57]Dhaivat Dholakiya, Shounak Bhattacharya, Ajay Gunalan, Abhik Singla, Shalabh Bhatnagar, Bharadwaj Amrutur, Ashitava Ghosal, Shishir Kolathaya:
Design, Development and Experimental Realization of a Quadrupedal Research Platform: Stoch. CoRR abs/1901.00697 (2019) - [i56]Chandramouli K, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
An Online Sample Based Method for Mode Estimation using ODE Analysis of Stochastic Approximation Algorithms. CoRR abs/1902.03806 (2019) - [i55]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Successive Over Relaxation Q-Learning. CoRR abs/1903.03812 (2019) - [i54]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Prabuchandran K. J., Shalabh Bhatnagar:
Actor-Critic Algorithms for Constrained Multi-agent Reinforcement Learning. CoRR abs/1905.02907 (2019) - [i53]Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar:
Second Order Value Iteration in Reinforcement Learning. CoRR abs/1905.03927 (2019) - [i52]Sindhu Padakandla, Prabuchandran K. J., Shalabh Bhatnagar:
Reinforcement Learning in Non-Stationary Environments. CoRR abs/1905.03970 (2019) - [i51]Shounak Bhattacharya, Abhik Singla, Abhimanyu, Dhaivat Dholakiya, Shalabh Bhatnagar, Bharadwaj Amrutur, Ashitava Ghosal, Shishir Kolathaya:
Learning Active Spine Behaviors for Dynamic and Efficient Locomotion in Quadruped Robots. CoRR abs/1905.06077 (2019) - [i50]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
Solution of Two-Player Zero-Sum Game by Successive Relaxation. CoRR abs/1906.06659 (2019) - [i49]Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar:
Generalized Speedy Q-learning. CoRR abs/1911.00397 (2019) - [i48]Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar:
A Convergent Off-Policy Temporal Difference Algorithm. CoRR abs/1911.05697 (2019) - [i47]Akshay Dharmavaram, Matthew Riemer, Shalabh Bhatnagar:
Hierarchical Average Reward Policy Gradient Algorithms. CoRR abs/1911.08826 (2019) - [i46]Sashank Tirumala, Aditya Sagi, Kartik Paigwar, Ashish Joglekar, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Gait Library Synthesis for Quadruped Robots via Augmented Random Search. CoRR abs/1912.12907 (2019) - 2018
- [j63]Enlu Zhou, Shalabh Bhatnagar:
Gradient-Based Adaptive Stochastic Search for Simulation Optimization Over Continuous Space. INFORMS J. Comput. 30(1): 154-167 (2018) - [j62]Ajin George Joseph, Shalabh Bhatnagar:
An incremental off-policy search in a model-free Markov decision process using a single sample path. Mach. Learn. 107(6): 969-1011 (2018) - [j61]Ajin George Joseph, Shalabh Bhatnagar:
An online prediction algorithm for reinforcement learning with linear function approximation using cross entropy method. Mach. Learn. 107(8-10): 1385-1429 (2018) - [j60]Prasenjit Karmakar, Shalabh Bhatnagar:
Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning. Math. Oper. Res. 43(1): 130-151 (2018) - [j59]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar, Csaba Szepesvári:
A Linearly Relaxed Approximate Linear Program for Markov Decision Processes. IEEE Trans. Autom. Control. 63(4): 1185-1191 (2018) - [j58]Arunselvan Ramaswamy, Shalabh Bhatnagar:
Analysis of Gradient Descent Methods With Nondiminishing Bounded Errors. IEEE Trans. Autom. Control. 63(5): 1465-1471 (2018) - [j57]Shalabh Bhatnagar, Sanjeev Patel, Karmeshu:
A stochastic approximation approach to active queue management. Telecommun. Syst. 68(1): 89-104 (2018) - [j56]Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar:
Novel Sensor Scheduling Scheme for Intruder Tracking in Energy Efficient Sensor Networks. IEEE Wirel. Commun. Lett. 7(5): 712-715 (2018) - [c54]Chandramouli K, Prabuchandran K. J., Sai Koti Reddy Danda, Shalabh Bhatnagar:
Generalized Deterministic Perturbations For Stochastic Gradient Search. CDC 2018: 5734-5739 - [c53]Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar:
A unified decision making framework for supply and demand management in microgrid networks. SmartGridComm 2018: 1-7 - [i45]Ajin George Joseph, Shalabh Bhatnagar:
An Incremental Off-policy Search in a Model-free Markov Decision Process Using a Single Sample Path. CoRR abs/1801.10287 (2018) - [i44]Ajin George Joseph, Shalabh Bhatnagar:
A Cross Entropy based Optimization Algorithm with Global Convergence Guarantees. CoRR abs/1801.10291 (2018) - [i43]Ajin George Joseph, Shalabh Bhatnagar:
An Online Prediction Algorithm for Reinforcement Learning with Linear Function Approximation using Cross Entropy Method. CoRR abs/1806.06720 (2018) - [i42]Prashanth L. A., Shalabh Bhatnagar, Nirav Bhavsar, Michael C. Fu, Steven I. Marcus:
Random directions stochastic approximation with deterministic perturbations. CoRR abs/1808.02871 (2018) - [i41]Abhik Singla, Shounak Bhattacharya, Dhaivat Dholakiya, Shalabh Bhatnagar, Ashitava Ghosal, Bharadwaj Amrutur, Shishir Kolathaya:
Realizing Learned Quadruped Locomotion Behaviors through Kinematic Motion Primitives. CoRR abs/1810.03842 (2018) - [i40]Abhik Singla, Sindhu Padakandla, Shalabh Bhatnagar:
Memory-based Deep Reinforcement Learning for Obstacle Avoidance in UAV with Limited Environment Knowledge. CoRR abs/1811.03307 (2018) - 2017
- [j55]Chandrashekar Lakshminarayanan, Shalabh Bhatnagar:
A stability criterion for two timescale stochastic approximation schemes. Autom. 79: 108-114 (2017) - [j54]K. Lakshmanan, Shalabh Bhatnagar:
Quasi-Newton smoothed functional algorithms for unconstrained and constrained simulation optimization. Comput. Optim. Appl. 66(3): 533-556 (2017) - [j53]Arunselvan Ramaswamy, Shalabh Bhatnagar:
A Generalization of the Borkar-Meyn Theorem for Stochastic Recursive Inclusions. Math. Oper. Res. 42(3): 648-661 (2017) - [j52]Prashanth L. A., Shalabh Bhatnagar, Michael C. Fu, Steven I. Marcus:
Adaptive System Optimization Using Random Directions Stochastic Approximation. IEEE Trans. Autom. Control. 62(5): 2223-2238 (2017) - [j51]Karmeshu, Sanjeev Patel, Shalabh Bhatnagar:
Adaptive mean queue size and its rate of change: queue management with random dropping. Telecommun. Syst. 65(2): 281-295 (2017) - [c52]Sandeep Kumar, Sindhu Padakandla, Chandrashekar Lakshminarayanan, Priyank Parihar, K. Gopinath, Shalabh Bhatnagar:
Scalable Performance Tuning of Hadoop MapReduce: A Noisy Gradient Approach. CLOUD 2017: 375-382 - [c51]Ajin George Joseph, Shalabh Bhatnagar:
A model based search method for prediction in model-free Markov decision process. IJCNN 2017: 170-177 - [c50]Ajin George Joseph, Shalabh Bhatnagar:
Bounds for off-policy prediction in reinforcement learning. IJCNN 2017: 3991-3997 - [c49]