Source code files for the DeepDataFlow dataset #156

Closed
zehanort opened this issue Mar 17, 2021 · 3 comments

@zehanort

Why aren't the source code files of the DeepDataFlow dataset included? Is there any particular reason why you have included the IR files only, without the respective source code files? If not, would it be possible to include them? I need them in order to run some experiments with the ProGraML pipeline while also extracting some static features specifically from the OpenCL source code files. In general, I believe that including the source code files along with the IR files in your dataset would be very helpful for many researchers.

@zehanort changed the title from "Source code for the DeepDataFlow dataset" to "Source code files for the DeepDataFlow dataset" on Mar 17, 2021
@ChrisCummins
Owner

> In general, I believe that including the source code files along with the IR files in your dataset would be very helpful for many researchers.

I agree, and I do regret not uploading the sources. There wasn't any particular reason for omitting them; they just weren't relevant to the IR-level experiments I wanted to do.

If you are only interested in the OpenCL files, you can download the Device Mapping Dataset, which includes the OpenCL sources as well as the LLVM-IR and ProGraML graphs.

Hope that helps.

Cheers,
Chris

@zehanort
Author

Thank you very much, this really helps a lot!

One last question: I am having a hard time figuring out how to train your system and produce new predictions (and accuracy scores) for my own machine, using my own CPU/GPU pair and my own labels, which I obtained from some experiments on the devmap dataset that you provided. Could you give me some guidelines on this? I haven't found anything in the documentation.

(I'm really sorry if this is simple/straightforward and just a result of my unfamiliarity with Bazel...)

Thanks again in advance!

@ChrisCummins
Owner

Glad that helps.

We never finished open-sourcing the code for graph-level classification with ProGraML. We got pretty close, but then both of the core developers got new jobs and now struggle to find the time to finish it off :) Our most recent work-in-progress is here: #107

The easiest way to produce new results might be to use an off-the-shelf GGNN implementation from a library like DGL.

Cheers,
Chris
