Parallelize/Optimize Device Simulation Logic #53
Comments
Hello, I was wondering if any progress had been made on this issue. If not, I would like to get started on it. Thank you.
Hi @Philippe-Drolet, I have prioritized the implementation of […]. You are welcome to contribute yourself! I'm happy to answer any questions you may have.
Hello, I have started work on this and was wondering what you would recommend for debugging the CUDA files when they are called through the Python interface. So far, I have created a new pytest test using the debug networks that you defined, but I cannot step through my new .cu files line by line as I would with a regular Python file. I am currently running the tests in Visual Studio Code. Which IDE are you using? (I suppose it is impossible to debug the C++ files with PyCharm.) Any guidance would help, and I am also simply curious how you do it. Also, do you have any documentation on the purpose of the matrices ABCD_E in simulate passive? Thanks!
Hi @Philippe-Drolet, Sure, my preferred method of debugging is to use […]. This tool can be used when executing a Python script that calls a […]. In addition, […]
Hopefully, this helps! I'm happy to answer any further questions you may have.
[1] A. Chen, “A Comprehensive Crossbar Array Model With Solutions for Line Resistance and Nonlinear Device Characteristics,” IEEE Transactions on Electron Devices, vol. 60, no. 4, pp. 1318–1326, Apr. 2013, doi: 10.1109/ted.2013.2246791.
Thank you very much for this response. I will go for the good old printf approach; it seems to work so far!
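For readers following the printf approach mentioned above, here is a minimal sketch of device-side `printf` in a CUDA kernel. It is not taken from this project: the kernel `debug_currents`, the helper `run_debug_demo`, and the toy conductance and voltage values are all hypothetical and chosen only for illustration. It assumes a CUDA-capable GPU and compilation with `nvcc`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical kernel (not from this repository): multiplies a conductance
// by a voltage for each element and prints a few values for inspection.
__global__ void debug_currents(const float *g, const float *v, float *i_out, int n) {
  int idx = blockIdx.x * blockDim.x + threadIdx.x;
  if (idx >= n) return;
  i_out[idx] = g[idx] * v[idx];
  // Guard the printf so that only a handful of threads write output;
  // printing from every thread quickly floods the device-side buffer.
  if (idx < 4) {
    printf("[block %d, thread %d] idx=%d g=%e v=%f i=%e\n",
           blockIdx.x, threadIdx.x, idx, g[idx], v[idx], i_out[idx]);
  }
}

// Hypothetical host helper: fills toy inputs, launches the kernel, and
// returns 0 on success or 1 if any CUDA call reported an error.
int run_debug_demo(int n) {
  float *g = nullptr, *v = nullptr, *i_out = nullptr;
  size_t bytes = static_cast<size_t>(n) * sizeof(float);
  cudaError_t err = cudaMallocManaged(&g, bytes);
  if (err == cudaSuccess) err = cudaMallocManaged(&v, bytes);
  if (err == cudaSuccess) err = cudaMallocManaged(&i_out, bytes);
  if (err == cudaSuccess) {
    for (int k = 0; k < n; ++k) {
      g[k] = 1e-6f * (k + 1);  // toy conductances
      v[k] = 0.5f;             // toy voltages
    }
    int threads = 128;
    int blocks = (n + threads - 1) / threads;
    debug_currents<<<blocks, threads>>>(g, v, i_out, n);
    err = cudaGetLastError();  // reports launch errors
    // Synchronizing waits for the kernel and flushes the device printf buffer.
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
  }
  if (err != cudaSuccess) {
    fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err));
  }
  cudaFree(g);
  cudaFree(v);
  cudaFree(i_out);
  return err == cudaSuccess ? 0 : 1;
}
```

Calling `run_debug_demo(8)` from a host `main` should print one line for each of the first four elements. When the kernel is invoked from a Python test instead, pytest's output capturing may hide these lines unless capturing is disabled (for example with `-s`).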
Currently, when performing inference or programming devices in passive arrays (0T1R arrangements), devices are simulated sequentially. CUDA kernels and other optimization methods can be used to drastically improve performance, as some specific operations are not easily parallelizable using the Python API.
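To illustrate the general idea of replacing a sequential per-device loop with a CUDA kernel, here is a minimal sketch. It is not this project's implementation: it assumes a deliberately simplified model in which each device current is its conductance times the voltage difference between its word line and bit line, ignoring line resistance and nonlinear device behavior. All names (`device_currents`, `device_currents_sequential`, `compare_parallel_to_sequential`) and the toy values are hypothetical.

```cuda
#include <cmath>
#include <vector>
#include <cuda_runtime.h>

// One thread per device: row r, column c of a rows x cols passive array.
__global__ void device_currents(const float *g, const float *v_wl, const float *v_bl,
                                float *i_out, int rows, int cols) {
  int c = blockIdx.x * blockDim.x + threadIdx.x;
  int r = blockIdx.y * blockDim.y + threadIdx.y;
  if (r < rows && c < cols) {
    i_out[r * cols + c] = g[r * cols + c] * (v_wl[r] - v_bl[c]);
  }
}

// Sequential reference: visits the devices one at a time on the CPU.
void device_currents_sequential(const std::vector<float> &g, const std::vector<float> &v_wl,
                                const std::vector<float> &v_bl, std::vector<float> &i_out,
                                int rows, int cols) {
  for (int r = 0; r < rows; ++r)
    for (int c = 0; c < cols; ++c)
      i_out[r * cols + c] = g[r * cols + c] * (v_wl[r] - v_bl[c]);
}

// Runs both versions on toy data and returns the largest absolute difference
// between them, or -1.0f if a CUDA call failed.
float compare_parallel_to_sequential(int rows, int cols) {
  size_t n = static_cast<size_t>(rows) * cols;
  std::vector<float> g(n), v_wl(rows), v_bl(cols, 0.0f), ref(n), out(n);
  for (size_t k = 0; k < n; ++k) g[k] = 1e-6f * (1 + k % 7);     // toy conductances
  for (int r = 0; r < rows; ++r) v_wl[r] = 0.1f * (r % 5);       // toy word-line voltages
  device_currents_sequential(g, v_wl, v_bl, ref, rows, cols);

  float *d_g = nullptr, *d_wl = nullptr, *d_bl = nullptr, *d_out = nullptr;
  cudaError_t err = cudaMalloc(&d_g, n * sizeof(float));
  if (err == cudaSuccess) err = cudaMalloc(&d_wl, rows * sizeof(float));
  if (err == cudaSuccess) err = cudaMalloc(&d_bl, cols * sizeof(float));
  if (err == cudaSuccess) err = cudaMalloc(&d_out, n * sizeof(float));
  if (err == cudaSuccess) err = cudaMemcpy(d_g, g.data(), n * sizeof(float), cudaMemcpyHostToDevice);
  if (err == cudaSuccess) err = cudaMemcpy(d_wl, v_wl.data(), rows * sizeof(float), cudaMemcpyHostToDevice);
  if (err == cudaSuccess) err = cudaMemcpy(d_bl, v_bl.data(), cols * sizeof(float), cudaMemcpyHostToDevice);
  if (err == cudaSuccess) {
    dim3 block(16, 16);
    dim3 grid((cols + block.x - 1) / block.x, (rows + block.y - 1) / block.y);
    device_currents<<<grid, block>>>(d_g, d_wl, d_bl, d_out, rows, cols);
    err = cudaGetLastError();
  }
  // This blocking copy also waits for the kernel to finish.
  if (err == cudaSuccess) err = cudaMemcpy(out.data(), d_out, n * sizeof(float), cudaMemcpyDeviceToHost);
  cudaFree(d_g);
  cudaFree(d_wl);
  cudaFree(d_bl);
  cudaFree(d_out);
  if (err != cudaSuccess) return -1.0f;

  float max_diff = 0.0f;
  for (size_t k = 0; k < n; ++k) max_diff = std::fmax(max_diff, std::fabs(out[k] - ref[k]));
  return max_diff;
}
```

The array sizes passed to `compare_parallel_to_sequential` need not be multiples of the block size, since the bounds check in the kernel discards out-of-range threads. A realistic passive-array model couples devices through line resistance, so it would not reduce to an independent per-device update like this one.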