Environment hangs up when spawning from different processes #75
Comments
Hi @tianfanzhu, can you confirm that you are running the latest version of Obstacle Tower (v1.2)? Also, does it work when using the basic usage Python notebook we provide as an example? |
Hi @awjuliani, I am indeed running this on the latest version, v1.2. Also, I found out from the basic usage notebook that the screen is gray, as shown above, until the env is reset or stepped. |
I have the same problem. |
Me too, except I'm running this via the Unity Obstacle Tower Challenge run.py script, and at startup I see the game character appear and then fall off the blank screen into nothingness. After that, there is only an empty gray screen. This is on an iMac running macOS 10.14.4. |
However, when I click on the obstacletower.app file directly, it runs flawlessly. |
Hi all, it may be difficult to tell whether everyone is experiencing the same issue. A couple of important things to note:
So with that in mind: @NancyFulda are you running in evaluation mode or just directly running the |
Hi @harperj, thanks for looking into this! I'm directly executing run.py. Interestingly, the behavior this morning is different than it was on Saturday (maybe I rebooted in between?). I still see grayness, but the game character no longer appears. However, the run.py script no longer hangs, and instead prints out the reward for each episode. Is this the expected behavior? It would be nice to be able to watch the agent's character navigate the world (to see where it's messing up), but since the environment seems to be executing at faster-than-real-time speed, maybe the grayed-out screen is normal? The UnitySDK.log contents are as follows: 4/1/2019 1:35:59 PM Log Log Log Log Log Log Log Log Log Log Log Log Log Log Log Log Log Log |
@NancyFulda This is the expected behavior. When training, the camera isn't turned on in order to improve performance. You can see the camera by turning on realtime mode in the environment ( |
@harperj Ah, that worked perfectly! Everything seems to be in order now. Thank you! |
Hi @harperj @awjuliani, I also encountered the same issue. I tried to use ML-Agents 0.8.1 by simply setting options['--env'] = 'ObstacleTower/ObstalceTower'. After launching 2 envs, one env had the agent just spawning and falling down, while the other env was simply 'not responding'. CPU and GPU usage for the env with the falling agent was very high. This issue occurs on my Windows machine (Windows 10), but the same setup works without problems on my Mac. I've also checked that I'm using ObstacleTower-v1.3. Here's the reference video: https://youtu.be/u-J7mlwlmr0 |
I was able to get large-scale-curiosity + Obstacle Challenge working up to about 32 agents
@karta1297963 what you see in your video is what happens when the Unity environment does not sync with Python. Even with everything I did above, I still see this 1 in 5 times when starting off a run (even with different code bases) |
Like @Sohojoe said, this looks like an issue with the connection between Python and Obstacle Tower / Unity. It could be that the port is in use for something else, that the worker_id is not being set correctly, or that the environment takes longer than the timeout_wait to start up. You could potentially have your script fail gracefully and re-launch on timeout as well, or try a new worker_id if you have a reserved port that conflicts. |
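A minimal sketch of the "fail gracefully and try a new worker_id" idea from the comment above. Everything here is illustrative rather than part of Obstacle Tower or ML-Agents: `launch_with_retry`, `port_is_free`, and the `make_env` factory are hypothetical names, and `BASE_PORT = 5005` assumes the ML-Agents default, where each environment communicates on the base port plus its worker_id.

```python
import socket
import time

# Assumption: ML-Agents default base port; each env uses BASE_PORT + worker_id.
BASE_PORT = 5005


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is currently bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def launch_with_retry(make_env, first_worker_id=0, attempts=5,
                      pause=0.0, port_check=port_is_free):
    """Try successive worker_ids until make_env(worker_id) succeeds.

    make_env is a hypothetical caller-supplied factory. It should raise
    an exception when the environment fails to connect in time.
    Returns (worker_id, env) for the first successful launch.
    """
    last_error = None
    for worker_id in range(first_worker_id, first_worker_id + attempts):
        if not port_check(BASE_PORT + worker_id):
            continue  # port already taken, so skip this worker_id
        try:
            return worker_id, make_env(worker_id)
        except Exception as error:  # e.g. a timeout raised by the Unity wrapper
            last_error = error
            time.sleep(pause)  # give a half-started process time to exit
    raise RuntimeError(
        f"no environment started after {attempts} attempts"
    ) from last_error
```

With Obstacle Tower, `make_env` could be something like `lambda wid: ObstacleTowerEnv(env_path, worker_id=wid, timeout_wait=60)`, assuming the constructor accepts the `worker_id` and `timeout_wait` arguments discussed in this thread and that `env_path` points at your local build. A failed attempt may leave a Unity process behind, so it can be worth cleaning those up between retries.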
@Sohojoe @harperj thanks for helping. I built two instances of the default ML-Agents Pyramids task, one with SDK v0.6 and one with v0.8. It turns out the v0.6 instance has the same sync issue while the v0.8 instance doesn't. I guess the possible solutions are:
|
@karta1297963 - what platform / OS are you using? |
@Sohojoe I'm using Windows 10. |
@karta1297963 - I created a simple repro that spawns many instances, as an example of how I do it: https://github.com/Sohojoe/many_towers |
Hello,
I'm trying to run the Obstacle Tower environment through the large-scale-curiosity project. However, it seems to hang when it tries to create the environment from its subprocesses. It prints out that the CrashReporter is initialized and the mono config paths, then does nothing for a while and hangs with the following image:
This is running on a 2018 13-inch MacBook Pro without a GPU. Is there any way to troubleshoot and debug this? I can't really do anything, as nothing is really logged from inside the environment.