
Commit 2b9177f

Merge pull request #88 from rajdeepsh/main
Added ICSE'25 Paper & Updated Profile
2 parents: c506a47 + 5052ffe

File tree: 6 files changed (+95 −1 lines)


content/authors/rajdeep-sh/_index.md (+1 −1)

@@ -53,7 +53,7 @@ social:
   # link: https://scholar.google.co.uk/citations?user=sIwtMXoAAAAJ
 - icon: github
   icon_pack: fab
-  link: https://github.com/rajfly
+  link: https://github.com/rajdeepsh
 # Link to a PDF of your resume/CV from the About widget.
 # To enable, copy your resume/CV to `static/files/cv.pdf` and uncomment the lines below.
 # - icon: cv
content/authors/rajdeep-sh/avatar.jpg (binary: 22.7 KB new, 116 KB old)

Binary file not shown.
New file (+6 lines):

@@ -0,0 +1,6 @@
+---
+title: Our paper "On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations" was accepted at ICSE '25!
+date: 2025-02-10
+---
+
+
New file (+13 lines):

@@ -0,0 +1,13 @@
+@inproceedings{mistakenassumption,
+author = {Hundal, Rajdeep Singh and Xiao, Yan and Cao, Xiaochun and Dong, Jin Song and Rigger, Manuel},
+title = {On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations},
+year = {2025},
+publisher = {Association for Computing Machinery},
+address = {New York, NY, USA},
+abstract = {Deep Reinforcement Learning (DRL) is a paradigm of artificial intelligence where an agent uses a neural network to learn which actions to take in a given environment. DRL has recently gained traction from being able to solve complex environments like driving simulators, 3D robotic control, and multiplayer-online-battle-arena video games. Numerous implementations of the state-of-the-art algorithms responsible for training these agents, like the Deep Q-Network (DQN) and Proximal Policy Optimization (PPO) algorithms, currently exist. However, studies make the mistake of assuming implementations of the same algorithm to be consistent and thus, interchangeable. In this paper, through a differential testing lens, we present the results of studying the extent of implementation inconsistencies, their effect on the implementations' performance, as well as their impact on the conclusions of prior studies under the assumption of interchangeable implementations. The outcomes of our differential tests showed significant discrepancies between the tested algorithm implementations, indicating that they are not interchangeable. In particular, out of the five PPO implementations tested on 56 games, three implementations achieved superhuman performance for 50% of their total trials while the other two implementations only achieved superhuman performance for less than 15% of their total trials. Furthermore, the performance among the high-performing PPO implementations was found to differ significantly in nine games. As part of a meticulous manual analysis of the implementations' source code, we analyzed implementation discrepancies and determined that code-level inconsistencies primarily caused these discrepancies. Lastly, we replicated a study and showed that this assumption of implementation interchangeability was sufficient to flip experiment outcomes. Therefore, this calls for a shift in how implementations are being used. In addition, we recommend for (1) replicability studies for studies mistakenly assuming implementation interchangeability, (2) DRL researchers and practitioners to adopt the differential testing methodology proposed in this paper to combat implementation inconsistencies, and (3) the use of large environment suites.},
+booktitle = {Proceedings of the IEEE/ACM 47th International Conference on Software Engineering},
+numpages = {13},
+keywords = {reinforcement learning, differential testing},
+location = {Ottawa, Canada},
+series = {ICSE '25}
+}
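The abstract above describes a differential-testing setup: run several implementations of the same DRL algorithm on the same environments and compare their outcomes, flagging large performance discrepancies. A minimal sketch of that comparison step, where the implementation names, trial data, and spread threshold are all illustrative and not taken from the paper:

```python
# Hypothetical sketch of the differential-testing comparison described in
# the abstract: if implementations of the *same* algorithm are truly
# interchangeable, their superhuman-performance rates should be similar.
# All names, data, and the threshold below are made up for illustration.
from statistics import mean

# 1 = trial reached superhuman performance, 0 = it did not (invented data).
trial_outcomes = {
    "impl_a": [1, 1, 0, 1, 0, 1, 1, 0],
    "impl_b": [1, 0, 1, 1, 1, 0, 1, 1],
    "impl_c": [0, 0, 0, 1, 0, 0, 0, 0],
}

def superhuman_rate(trials):
    """Share of trials that reached superhuman performance."""
    return mean(trials)

rates = {name: superhuman_rate(t) for name, t in trial_outcomes.items()}

# Naive discrepancy check: a large spread between the best- and
# worst-performing implementation suggests they are not interchangeable.
spread = max(rates.values()) - min(rates.values())
interchangeable = spread < 0.10  # illustrative threshold

print(rates, "spread:", round(spread, 3), "interchangeable:", interchangeable)
```

The paper's actual methodology involves full training runs and statistical analysis across 56 games; this sketch only shows the shape of the "compare implementations, flag discrepancies" step.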
New file (+75 lines):

@@ -0,0 +1,75 @@
+---
+title: "On the Mistaken Assumption of Interchangeable Deep Reinforcement Learning Implementations"
+authors:
+- Rajdeep Sh
+- Yan Xiao
+- Xiaochun Cao
+- Jin Song Dong
+- Manuel Rigger
+date: "2025-05-01T00:00:00Z"
+doi: ""
+
+# Schedule page publish date (NOT publication's date).
+publishDate: "2025-02-08T00:00:00Z"
+
+# Publication type.
+# Legend: 0 = Uncategorized; 1 = Conference paper; 2 = Journal article;
+# 3 = Preprint / Working Paper; 4 = Report; 5 = Book; 6 = Book section;
+# 7 = Thesis; 8 = Patent
+publication_types: ["1"]
+
+# Publication name and optional abbreviated publication name.
+publication: In *Proceedings of the 47th International Conference on Software Engineering*
+publication_short: In *ICSE 2025*
+
+# abstract: Database systems are widely used to store and query data. Test oracles have been proposed to find logic bugs in such systems, that is, bugs that cause the database system to compute an incorrect result. To realize a fully automated testing approach, such test oracles are paired with a test case generation technique; a test case refers to a database state and a query on which the test oracle can be applied. In this work, we propose the concept of Query Plan Guidance (QPG) for guiding automated testing towards "interesting" test cases. SQL and other query languages are declarative. Thus, to execute a query, the database system translates every operator in the source language to one of potentially many so-called physical operators that can be executed; the tree of physical operators is referred to as the query plan. Our intuition is that by steering testing towards exploring diverse query plans, we also explore more interesting behaviors—some of which are potentially incorrect. To this end, we propose a mutation technique that gradually applies promising mutations to the database state, causing the DBMS to create diverse query plans for subsequent queries. We applied our method to three mature, widely-used, and extensively-tested database systems—SQLite, TiDB, and CockroachDB—and found 53 unique, previously unknown bugs. Our method exercises 4.85—408.48× more unique query plans than a naive random generation method and 7.46× more than a code coverage guidance method. Since most database systems—including commercial ones—expose query plans to the user, we consider QPG a generally applicable, black-box approach and believe that the core idea could also be applied in other contexts (e.g., to measure the quality of a test suite).
+# Summary. An optional shortened abstract.
+# summary: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Duis posuere tellus ac convallis placerat. Proin tincidunt magna sed ex sollicitudin condimentum.
+
+#tags:
+#- Source Themes
+#featured: true
+
+#links:
+#- name: Custom Link
+# url: http://example.org
+# url_pdf: https://dl.acm.org/doi/pdf/10.1145/3597503.3623307
+#url_code: '#'
+#url_dataset: '#'
+#url_poster: '#'
+#url_project: ''
+#url_slides: ''
+#url_source: '#'
+#url_video: '#'
+
+# Featured image
+# To use, add an image named `featured.jpg/png` to your page's folder.
+#image:
+# caption: 'Image credit: [**Unsplash**](https://unsplash.com/photos/pLCdAaMFLTE)'
+# focal_point: ""
+# preview_only: false
+
+# Associated Projects (optional).
+# Associate this publication with one or more of your projects.
+# Simply enter your project's folder or file name without extension.
+# E.g. `internal-project` references `content/project/internal-project/index.md`.
+# Otherwise, set `projects: []`.
+#projects:
+#- sqlancer
+
+# Slides (optional).
+# Associate this publication with Markdown slides.
+# Simply enter your slide deck's filename without extension.
+# E.g. `slides: "example"` references `content/slides/example/index.md`.
+# Otherwise, set `slides: ""`.
+#slides:
+
+# move below
+#{{% callout note %}}
+#Click the *Cite* button above to demo the feature to enable visitors to import publication metadata into their reference management software.
+#{{% /callout %}}
+
+#Supplementary notes can be added here, including [code and math](https://sourcethemes.com/academic/docs/writing-markdown-latex/).
+
+---
+
