Journal papers

S. Agrawal, R. Jia, "Learning in Structured MDPs with Convex Cost Functions: improved regret bounds for inventory management". Accepted to appear in Operations Research journal.

S. Agrawal and R. Jia, "Optimistic posterior sampling for reinforcement learning: worst-case regret bounds". Accepted to appear in Mathematics of Operations Research journal.

S. Agrawal, "Recent Advances in Multiarmed Bandits for Sequential Decision Making", INFORMS TutORials in Operations Research, Operations Research & Management Science in the Age of Analytics, Pages 167-188, October 2019.

S. Agrawal, V. Avadhanula, V. Goyal, A. Zeevi, "MNL-Bandit: A Dynamic Learning Approach to Assortment Selection". Operations Research, 67(5):1209-1502 (2019).

S. Agrawal, N. R. Devanur, "Bandits with global convex constraints and objective". Operations Research 67(5):1486-1502 (2019).

S. Agrawal, N. Goyal, "Near-optimal regret bounds for Thompson Sampling", Journal of the ACM (JACM), Volume 64 Issue 5, October 2017. [ EE ][ pdf ].
(The pdf version here includes a minor correction made to the published version: in Lemma 2.13, page 15:16. Modified text is in blue)

S. Agrawal, Z. Wang and Y. Ye, "A Dynamic Near-Optimal Algorithm for Online Linear Programming". Operations Research 62:876-890 (2014). [ EE ] [ arXiv ]

S. Agrawal, Y. Ding, A. Saberi, and Y. Ye, "Price of Correlations in Stochastic Optimization". Operations Research 60:243-248 (2012). [ EE ]

S. Agrawal, E. Delage, M. Peters, Z. Wang, and Y. Ye, "A Unified Framework for Dynamic Prediction Market Design". Operations Research 59(3):550-568 (2011). [ EE ] [ arXiv ]

S. Agrawal, N. Megiddo and B. Armbruster, "Equilibrium in Prediction Markets with Buyers and Sellers". Economics Letters 109:46-49 (2010). [ EE ] [ pdf ]

S. Agrawal, J.R. Haritsa and B.A. Prakash, "FRAPP: A Framework for High-Accuracy Privacy-Preserving Mining". Data Mining and Knowledge Discovery Journal 18:101-139 (2009). [ EE ] [ arXiv ]

S. Agrawal, C. N. Kanthi, K. V. M. Naidu, J. Ramamirtham, R. Rastogi, S. Satkin, and A. Srinivasan, "Monitoring infrastructure for converged networks and services". Bell Labs Technical Journal 12(2): 63-77 (2007). [ EE ]

Peer-reviewed conference papers and manuscripts

Z. Zhang, S. Agrawal, I. Lobel, S. R. Sinclair, C. L. Yu, Reinforcement Learning in MDPs with Information-Ordered Policies, Aug 2025 [ arXiv ]

P. Agrawal, S. Agrawal, A. Azati, Q-learning with Posterior Sampling, June 2025. [ arXiv ]

P. Agrawal, S. Agrawal, Optimistic Q-learning for average reward and episodic reinforcement learning, COLT 2025. [ arXiv ]

S. Agrawal, W. Tang, Dynamic pricing and learning with long-term reference effects, EC 2024. [ arXiv ]

S. Agrawal, Y. Feng, W. Tang, Dynamic pricing and learning with Bayesian persuasion, NeurIPS 2023 [ arXiv ]

S. Yin, S. Agrawal, A. Zeevi, Online Allocation and Learning in the Presence of Strategic Agents, NeurIPS 2022. [ arXiv ]

S. R. Putta, S. Agrawal, Scale Free Adversarial Multi Armed Bandits, ALT 2022. [ arXiv ]

S. Agrawal, S. Yin, A. Zeevi, Dynamic Pricing and Learning under Bass Model, EC 2021. [arXiv]

S. Agrawal, E. Balkanski, V. Mirrokni, B. Sivan, Dynamic First Price Auctions Robust to Heterogeneous Buyers, EC 2021. [arXiv]

On optimal ordering in the optimal stopping problem, Shipra Agrawal, Jayachandran Sethuraman, Xingyu Zhang. EC 2020. [arXiv]

Reinforcement Learning for Integer Programming: Learning to Cut, Yunhao Tang, Shipra Agrawal, Yuri Faenza. ICML 2020. [arXiv]

S. Agrawal and R. Jia, Learning in structured MDPs with convex cost functions: Improved regret bounds for inventory management. ACM conference on Economics and Computation (EC) 2019. [arXiv]

S. Agrawal, M. Shadravan, C. Stein, Submodular Secretary Problem with Shortlists, ITCS (Innovations in Theoretical Computer Science) 2019. [ arXiv ]

S. Agrawal, V. Mirrokni, M. Zadimoghaddam, Proportional Allocation: Simple, Distributed, and Diverse Matching with High Entropy, ICML 2018. [ pdf ]

C. Pike-Burke, S. Agrawal, S. Grunewalder, C. Szepesvari, Bandits with Delayed, Aggregated Anonymous Feedback, ICML 2018. [ arXiv ][ICML]

S. Agrawal, C. Daskalakis, V. Mirrokni, B. Sivan, Robust Repeated Auctions under Heterogeneous Buyer Behavior, EC 2018. [ arXiv ]

Y. Tang, S. Agrawal, Exploration by Distributional Reinforcement Learning, IJCAI 2018. [ arXiv ]

S. Agrawal and R. Jia, "Optimistic posterior sampling for reinforcement learning: worst-case regret bounds". NIPS 2017 (spotlight) [ arXiv ]

S. Agrawal, V. Avadhanula, V. Goyal, A. Zeevi, "Thompson Sampling for MNL-bandit". Conference on Learning Theory (COLT), 2017. [ arXiv ]

S. Agrawal, N. R. Devanur, "Linear Contextual Bandits with Knapsacks". NIPS 2016. [EE][ arXiv ]

S. Agrawal, V. Avadhanula, V. Goyal, A. Zeevi, "An Exploration-Exploitation Approach for Assortment Selection". ACM conference on Economics and Computation (EC) 2016. [ EE ][ pdf ]

S. Agrawal, N. R. Devanur, L. Li, "An efficient algorithm for contextual bandits with knapsacks, and an extension to concave objectives". Conference on Learning Theory (COLT) 2016. [EE][arXiv ]

S. Agrawal, N. R. Devanur, "Fast algorithms for online stochastic convex programming". In Proceedings of the 26th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2015. [ EE ] [ arXiv ]

S. Agrawal, N. R. Devanur, "Bandits with concave rewards and convex knapsacks". In Proceedings of the 15th ACM Conference on Electronic Commerce (EC), 2014. [ EE ] [ arXiv ]

T. Kocak, M. Valko, R. Munos, and S. Agrawal, "Spectral Thompson Sampling". In Proceedings of the 28th AAAI Conference on Artificial Intelligence (AAAI), 2014. [ pdf ]

S. Agrawal, N. Goyal, "Thompson Sampling for contextual bandits with linear payoffs". In Proceedings of the 30th International Conference on Machine Learning (ICML), 2013. [ pdf ] [ arXiv ]

S. Agrawal, N. Goyal, "Further optimal regret bounds for Thompson Sampling", In Proceedings of the 16th International Conference on Artificial Intelligence and Statistics (AISTATS), 2013. [ pdf ] [ arXiv ]

S. Agrawal, N. Goyal, "Analysis of Thompson Sampling for the multi-armed bandit problem". In Proceedings of the 25th Annual Conference on Learning Theory (COLT), 2012. [ pdf ] [ arXiv ]

"Correlation Robust Stochastic Optimization". In Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), 2010. [ EE ] [ arXiv ]

"A Unified Framework for Dynamic Parimutuel Information Market Design". In Proceedings of the 10th ACM Conference on Electronic Commerce (EC), 2009. [ EE ]

S. Agrawal, Z. Wang and Y. Ye, "Parimutuel Betting on Permutations". In Proceedings of the 4th International Workshop On Internet And Network Economics (WINE), 2008. [ EE ] [ arXiv ]

S. Agrawal, K.V.M. Naidu and R. Rastogi, "Diagnosing Link-Level Anomalies Using Passive Probes". In Proceedings of the 26th Annual IEEE Conference on Computer Communications (INFOCOM), 2007. [ EE ]

S. Agrawal, S. Deb, K.V.M. Naidu, and R. Rastogi, "Efficient Detection of Distributed Constraint Violations". Short paper. In Proceedings of the 23rd International Conference on Data Engineering (ICDE), 2007. [ EE ]

S. Agrawal, P.P.S. Narayan, J. Ramamirtham, R. Rastogi, M. Smith, K. Swanson, and M. Thottan, "VoIP service quality monitoring using active and passive probes". In Proceedings of the First International Conference on COMmunication System softWAre and MiddlewaRE (COMSWARE), 2006. [ EE ]

S. Agrawal, J.R. Haritsa, "A Framework for High-Accuracy Privacy-Preserving Mining". In Proceedings of the 21st International Conference on Data Engineering (ICDE), 2005. [ EE ] [ arXiv ]

S. Agrawal, V. Krishnan and J.R. Haritsa, "On Addressing Efficiency Concerns in Privacy-Preserving Mining". In Proceedings of the 9th International Conference on Database Systems for Advanced Applications (DASFAA), 2004. [ EE ] [ arXiv ]