Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infinite loop between ExceptionCatchingBundleCollectionInitializer and BundleCollection.GetReadLock #460

Open
mhoyer opened this issue Sep 19, 2014 · 3 comments

Comments

@mhoyer
Copy link

mhoyer commented Sep 19, 2014

We recently found weird behavior related to Cassette in our web application for a quite large customer. Sometimes a HTTP requests from a users browser took minutes to respond on our live environment. We sadly couldn't reproduce it on our internal environments so the only way to find out about the actual reason was by dumping the w3p.exe process of live IIS and inspecting it with WinDbg. So fasten your seatbelt and see what we did:

Requirements

I used

  • latest WinDbg from WDK 8.1 (Standalone Debugging Tools for Windows (WinDbg))
  • SOSEX on the upper right: Vista+/x64
    ** extract the .zip and copy the .dll to "c:\Program Files (x86)\Windows Kits\8.1\Debuggers\x64"

Procedure

Open WinDbg and with File > Load Crash Dump I loaded the dump.

Preparing the session

.loadby sos clr
.load sosex
!bhi

Find the threads with locks

0:0> !mlocks

ClrThread  DbgThread  OsThread    LockType    Lock              LockLevel
------------------------------------------------------------------------------
....
0xb2       113        0x663c0     thinlock    000000f441022170  (recursion:0)
0x8e  !!   140        0x68eb4     SyncBlock   000000f87cf57408                
0x8e  !!   140        0x68eb4     RWLockSlim  000000f240dbb3e0  Writer        
0x8e       140        0x68eb4     thinlock    000000f4014e8440  (recursion:0)
....

Okay. DbgThread 140 looks interesting. So I switched to it with:

0:0> ~140s

Now list the full stack trace of this thread:

0:140> !mk

Scrolling through the very long list of calls in the stack I found a lot (more than 100 times) of loops over those 8 lines:

0:140> !mk
...
19d:M 000000f18abdc080 00007ffcb55f8fd0 Cassette.Caching.ResourceAssetCacheValidator.IsValid(System.String, System.DateTime)(+0xb IL,+0x40 Native)
19e:M 000000f18abdc110 00007ffcb55f8d93 Cassette.Caching.ManifestValidator.Cassette.IBundleVisitor.Visit(Cassette.IAsset)(+0x28 IL,+0x63 Native)
--- 
19f:M 000000f18abdc150 00007ffcb55f892b Cassette.CachedBundleContent.Accept(Cassette.IBundleVisitor)(+0x17 IL,+0x5b Native)
1a0:M 000000f18abdc1a0 00007ffcb55f51d6 Cassette.Bundle.Accept(Cassette.IBundleVisitor)(+0x1f IL,+0xa6 Native)
1a1:M 000000f18abdc230 00007ffcb55f8b99 Cassette.Caching.ManifestValidator.IsValid(Cassette.Caching.Manifest)(+0x2f IL,+0x69 Native)
1a2:M 000000f18abdc280 00007ffcb0229fb8 Cassette.CacheAwareBundleCollectionInitializer.Initialize(Cassette.BundleCollection)(+0x42 IL,+0xa8 Native)
1a3:M 000000f18abdc2f0 00007ffcb0229e42 Cassette.ExceptionCatchingBundleCollectionInitializer.Initialize(Cassette.BundleCollection)(+0xc IL,+0x32 Native)
1a4:M 000000f18abdc330 00007ffcb55f91c9 Cassette.BundleCollection.GetReadLock()(+0x36 IL,+0xa9 Native)
1a5:M 000000f18abdc3b0 00007ffcb55f8fd0 Cassette.Caching.ResourceAssetCacheValidator.IsValid(System.String, System.DateTime)(+0xb IL,+0x40 Native)
1a6:M 000000f18abdc440 00007ffcb55f8d93 Cassette.Caching.ManifestValidator.Cassette.IBundleVisitor.Visit(Cassette.IAsset)(+0x28 IL,+0x63 Native)
--- 
1a7:M 000000f18abdc480 00007ffcb55f892b Cassette.CachedBundleContent.Accept(Cassette.IBundleVisitor)(+0x17 IL,+0x5b Native)
1a8:M 000000f18abdc4d0 00007ffcb55f51d6 Cassette.Bundle.Accept(Cassette.IBundleVisitor)(+0x1f IL,+0xa6 Native)
...

Thus, I inspected all involved types of those lines:

  • Cassette.Bundle
  • Cassette.BundleCollection
  • Cassette.CacheAwareBundleCollectionInitializer
  • Cassette.CachedBundleContent
  • Cassette.Caching.ManifestValidator
  • Cassette.Caching.ResourceAssetCacheValidator
  • Cassette.ExceptionCatchingBundleCollectionInitializer

After digging into source code of Cassette for those particular types I found BundleCollection.GetReadLock() and ExceptionCatchingBundleCollectionInitializer quite interesting regarding the stateful property BundleCollection.InitializationException.

Now I ran the !mdso command for this particular type holding the BundleCollection.InitializationException and click the Object address link in the output:

0:140> !mdso /t:Cassette.BundleCollection

Thread 140:
Location          Object            Type
------------------------------------------------------------
000000f18abd2010  000000f240dbb340  Cassette.BundleCollection
                  ^^^^^^^^^^^^^^^^

This link (000000f240dbb340) automatically executed this command:

0:140> !mdt 000000f240dbb340
000000f240dbb340 (Cassette.BundleCollection)
    bundles:000000f240dbb398 (System.Collections.Generic.List`1[[Cassette.Bundle, Cassette]])
    settings:000000f240d9d0f8 (Cassette.CassetteSettings)
    fileSearchProvider:000000f240dba978 (Cassette.FileSearchProvider)
    bundleFactoryProvider:000000f240dba9e8 (Cassette.BundleFactoryProvider)
    bundleCollectionInitializer:000000f240dbb328 (Cassette.ExceptionCatchingBundleCollectionInitializer)
    readerWriterLock:000000f240dbb3e0 (System.Threading.ReaderWriterLockSlim)
    bundleImmediateReferences:000000f240e376d8 (System.Collections.Generic.Dictionary`2[[Cassette.Bundle, Cassette],[Cassette.Utilities.HashedSet`1[[Cassette.Bundle, Cassette]], Cassette]])
!!  initializationException:000000f441dd5ed8 (System.Threading.ThreadAbortException)
    Changed:000000f240dbc3c8 (System.EventHandler`1[[Cassette.BundleCollectionChangedEventArgs, Cassette]])

See the initializationException field inside the output? It's of type ThreadAbortException!

My guess is the IIS app pool was recycled, kicked, thread-aborted, whatever and now the looping starts. Especially in conjunction with the implementation of ExceptionCatchingBundleCollectionInitializer: I am wondering why there is no more concrete exception type matching in the try-catch block of its Initialize method?

@mhoyer
Copy link
Author

mhoyer commented Sep 19, 2014

Side notes

Related sources

@joekrill
Copy link

I, too, am having this problem. It is extremely sporadic, and I haven't been able to figure out how to consistently reproduce it at will. But when it DOES happen, it continues to happen consistently for a while thereafter (that is, it's not just a random one-off thing every so often -- once it occurs, it continues occuring). I usually try things like: deleting the Cassette cache, rebuilding my project, modifying the web.config settings... and at some point it will start working again. But I can't find one particular thing that seems to fix the problem.

One thing that seems to instigate this issue more regularly is modifying or adding files to the file system. For example, I'm working with ReactJS and have a weather component. I dropped in a "weather-icons" directory where my script is, which contains some CSS and font files. And that caused cassette to blow up with this StackOverflowException. I've also experienced it after just modifying some of the javascript files themselves.

I'll continue to investigate but if anyone has any ideas -- or something more specific -- I'm happy to test or provide more feedback.

Some details about my setup:

  • Cassette v 2.4.2.13844 (using Cassette.Aspnet package from Nuget)
  • I'm building on top of Kentico CMS, which uses Webforms.

mhoyer pushed a commit to mhoyer/cassette that referenced this issue Sep 29, 2015
…llect general exceptions and rethrow ThreadAbortExceptions immediately
@kamranayub
Copy link
Contributor

Perhaps related perhaps not, but I was just experiencing an issue I traced down to Cassette which has similar symptoms--except it was EVERY request that used Cassete was causing a stack overflow and crashing my w3wp process. The cause was I was missing JS files when my project built in Azure, so references (I assume) were missing and Cassette was having some trouble. The symptom I saw was 502 Bad Gateway errors on every request that included a reference to a Cassette bundle. My MVC actions that didn't return a full view worked fine and once I discovered the cause (missing JS files), adding them back fixed it.

Now that I know what causes it, I could probably try reproducing it locally to see what Cassette is doing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants