Thursday, March 24, 2011

Using WinDbg to troubleshoot high CPU in an ASP.Net MVC app

Last week, I received an email:
We noticed one rendering server had 85% proc utilization all morning. We asked ops to take a memory dump before doing the IIS reset. Can you help us take a look at it?
While there are lots of resources on debugging high CPU in ASP.Net applications, I thought I’d list out the specific steps I took to debug this particular case.
Let’s dig in! Here’s how we figured out what was causing the issue:
Our ops team used DebugDiag to capture the dump. Since a full mini dump is all of the memory in the process space, the file is the same size as the private bytes in the process. In this case the dump was right around 5GB, so I copied it locally first :)
Load up the memory dump in WinDbg. We use 64-bit Windows 2008 R2 Servers, so we’ll use the 64-bit version of WinDbg. You can download WinDbg via the Debugging Tools for Windows package here. To open the dump, click File -> Open Crash Dump.
Load the psscor2 extension. You can get PSSCor2 here. PSSCor2 is like the SOS extension on steroids.
.load psscor2
First, I want to look at how much time threads are spending in user time. Without any additional parameters, the !runaway command will show how much time each thread has spent executing user mode code. You can also use other parameters to show kernel time and thread lifetime. In this case, I’m only interested in user time.
0:061> !runaway
 User Mode Time
  Thread       Time
  61:17c8      0 days 5:30:25.387
  63:174c      0 days 5:19:15.159
  60:1b08      0 days 5:15:17.742
  64:ad0       0 days 5:12:06.219
  69:1e50      0 days 5:07:28.600
  70:1dbc      0 days 5:06:42.720
  67:fe8       0 days 5:05:01.179
  65:18cc      0 days 5:03:09.576
  62:1f14      0 days 5:02:49.701
  66:6d4       0 days 4:59:53.997
  73:a10       0 days 4:50:38.727
Turns out we have 37 threads that have significantly higher user mode time than the other process threads. That’s suspicious. Let’s see what those threads are spending so much time doing. First, we switch to one of the interesting threads, then look at the call stack. The leftmost number is the thread id, and we use the syntax ~ s to switch threads.
0:000> ~61s
*** WARNING: Unable to verify checksum for
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfa3f:
000007fe`ef94d65f 413bc5          cmp     eax,r13d
OK, so we see here we’re finding an entry in a dictionary. Switching to some of the other interesting threads, we see similar info:
0:061> ~63s
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfab0:
000007fe`ef94d6d0 488b5710        mov     rdx,qword ptr [rdi+10h] ds:00000001`b2c3d030=0000000183fed728

0:063> ~60s
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfa2d:
000007fe`ef94d64d 483bc8          cmp     rcx,rax

0:060> ~64s
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfa26:
000007fe`ef94d646 4863cb          movsxd  rcx,ebx
That looks pretty suspicious. All four threads we’ve looked at are all in the same place in the call stack. Further validation of the call stack by using
and looking at the additional threads confirm the suspicion. Something weird is going on: all of our threads are getting hung up in the same place. In a previous investigation, we found that we had a runaway iterator that caused us problems. Looking at the call stack and the source code, it doesn’t appear to be the same problem.
The next step was to find out if the requests that triggered the issue were at all related. Let’s find out what url was being served. We’ll start back at the first thread. We use !clrstack -p to show the parameters that are available at each level in the stack. I’m looking for a method where I can get at the context objects. I’ve trimmed the output of the command here, but this is the method in the stack I focused in on:
000000000aa2e240 000007feec2a9efd System.Web.Mvc.ControllerActionInvoker.InvokeAction(System.Web.Mvc.ControllerContext, System.String)
        this = 0x000000011f866de0
        controllerContext = 0x000000011f866bf0
        actionName = [no data]
I have a controllerContext parameter that I can inspect and get to the original url. I use the address (0x000000011f866bf0) to get it’s contents.
Now comes the rinse and repeat part.
First, we dump the controllerContext object using it’s address:
0:061> !do 0x000000011f866bf0
Name: System.Web.Mvc.ControllerContext
MethodTable: 000007feec312348
EEClass: 000007feec1c5290
Size: 48(0x30) bytes
GC Generation: 2
              MT            Field           Offset                 Type VT             Attr            Value Name
000007fefae0e558  4000003        8 ...b.HttpContextBase  0 instance 000000011f8611a8 _httpContext
000007fef9c77070  4000004       10  0 instance 000000011f861468 _requestContext
000007fef9c771f0  4000005       18 ...Routing.RouteData  0 instance 000000011f861238 _routeData
000007feec3198b8  4000006       20  0 instance 000000011f866b68 [Controller] k__BackingField
Then dump the _httpContext (with it’s address of 000000011f8611a8):
0:061> !do 000000011f8611a8
Name: System.Web.HttpContextWrapper
MethodTable: 000007fefae0e728
EEClass: 000007fefade2248
Size: 24(0x18) bytes
GC Generation: 2
              MT            Field           Offset                 Type VT             Attr            Value Name
000007fee7622488  4000004        8 ...m.Web.HttpContext  0 instance 000000010fa0aa58 _context
Then dump the _context (address of 000000010fa0aa58):
0:061> !do 000000010fa0aa58
Name: System.Web.HttpContext
MethodTable: 000007fee7622488
EEClass: 000007fee72823b8
Size: 336(0x150) bytes
GC Generation: 2
              MT            Field           Offset                 Type VT             Attr            Value Name
000007fee7624fb0  4000fe6        8 ...IHttpAsyncHandler  0 instance 0000000000000000 _asyncAppHandler
000007fee7624ae0  4000fe7       10 ...b.HttpApplication  0 instance 0000000190ad1620 _appInstance
000007fee7625020  4000fe8       18 ....Web.IHttpHandler  0 instance 000000011f861488 _handler
000007fee7625460  4000fe9       20 ...m.Web.HttpRequest  0 instance 000000010fa0aba8 _request
000007fee76257e0  4000fea       28 ....Web.HttpResponse  0 instance 000000010fa0acf8 _response
000007fee7625cf8  4000feb       30 ...HttpServerUtility  0 instance 0000000000000000 _server
000007feefa71020  4000fec       38 ...Collections.Stack  0 instance 0000000000000000 _traceContextStack
000007fee762b060  4000fed       40 ....Web.TraceContext  0 instance 0000000000000000 _topTraceContext
000007feefa7f3d8  4000fee       48 ...ections.Hashtable  0 instance 000000011f8614a8 _items
000007feefa7e968  4000fef       50 ...ections.ArrayList  0 instance 0000000000000000 _errors
000007feefa77d48  4000ff0       58     System.Exception  0 instance 0000000000000000 _tempError
000007feefa76c50  4000ff1      118       System.Boolean  1 instance                0 _errorCleared
000007feefa929d8  4000ff2       60 ...ncipal.IPrincipal  0 instance 00000001a06e2608 _user
000007feefa7a488  4000ff3      100        System.IntPtr  1 instance 0 _pManagedPrincipal
000007fee7624758  4000ff4       68 ...ofile.ProfileBase  0 instance 0000000000000000 _Profile
000007feefab7fb8  4000ff5      128      System.DateTime  1 instance 000000010fa0ab80 _utcTimestamp
000007fee762a168  4000ff6       70 ...HttpWorkerRequest  0 instance 000000010fa0a870 _wr
000007feefa73850  4000ff7      130 ...Services.GCHandle  1 instance 000000010fa0ab88 _root
000007feefa7a488  4000ff8      108        System.IntPtr  1 instance 1877089392 _ctxPtr
000007fee761c5c0  4000ff9       78 ...m.Web.VirtualPath  0 instance 0000000000000000 _configurationPath
000007feefa76c50  4000ffa      119       System.Boolean  1 instance                0 _skipAuthorization
000007feefa80400  4000ffb       80 ...ation.CultureInfo  0 instance 000000019fc42e18 _dynamicCulture
000007feefa80400  4000ffc       88 ...ation.CultureInfo  0 instance 000000019fc42e18 _dynamicUICulture
000007feefa7ecf0  4000ffd      110         System.Int32  1 instance                0 _serverExecuteDepth
000007feefa71020  4000ffe       90 ...Collections.Stack  0 instance 0000000000000000 _handlerStack
000007feefa76c50  4000fff      11a       System.Boolean  1 instance                0 _preventPostback
000007feefa76c50  4001000      11b       System.Boolean  1 instance                0 _runtimeErrorReported
000007feefa76c50  4001001      11c       System.Boolean  1 instance                1 _firstNotificationInitCalled
000007feefab7fb8  4001002      138      System.DateTime  1 instance 000000010fa0ab90 _timeoutStartTime
000007feefa76c50  4001003      11d       System.Boolean  1 instance                1 _timeoutSet
000007feefab7eb8  4001004      140      System.TimeSpan  1 instance 000000010fa0ab98 _timeout
000007feefa7ecf0  4001005      114         System.Int32  1 instance                0 _timeoutState
000007fee7632540  4001006       98 ...b.Util.DoubleLink  0 instance 000000011f861168 _timeoutLink
000007feefa78520  4001007       a0 ....Threading.Thread  0 instance 00000001900da878 _thread
000007fee761e9e0  4001008       a8 ...eb.CachedPathData  0 instance 0000000000000000 _configurationPathData
000007fee761e9e0  4001009       b0 ...eb.CachedPathData  0 instance 000000011f866498 _filePathData
000007feefa77a80  400100a       b8        System.String  0 instance 0000000000000000 _sqlDependencyCookie
000007fee762e7b0  400100b       c0 ...essionStateModule  0 instance 0000000000000000 _sessionStateModule
000007fee7595230  400100c       c8 ...I.TemplateControl  0 instance 0000000000000000 _templateControl
000007fee75f6028  400100d       d0 ...tificationContext  0 instance 000000011f866aa0 _notificationContext
000007feefa76c50  400100e      11e       System.Boolean  1 instance                1 _isAppInitialized
000007feefa76c50  400100f      11f       System.Boolean  1 instance                1 _isIntegratedPipeline
000007feefa76c50  4001010      120       System.Boolean  1 instance                0 _finishPipelineRequestCalled
000007feefa76c50  4001011      121       System.Boolean  1 instance                0 _impersonationEnabled
000007feefa76c50  4001012      122       System.Boolean  1 instance                0 HideRequestResponse
000007feefa76c50  4001013      123       System.Boolean  1 instance                1 InIndicateCompletion
000007fee76262d8  4001014       d8 ...ion+ThreadContext  0 instance 000000011f861070 IndicateCompletionContext
000007fee7632410  4001015       e0 ...ronizationContext  0 instance 000000011f861038 _syncContext
000007feefa76c50  4001016      124       System.Boolean  1 instance                1 RequiresSessionState
000007feefa76c50  4001017      125       System.Boolean  1 instance                0 ReadOnlySessionState
000007feefa76c50  4001018      126       System.Boolean  1 instance                0 InAspCompatMode
000007fee7625020  4001019       e8 ....Web.IHttpHandler  0 instance 0000000000000000 _remapHandler
000007fee7625020  400101a       f0 ....Web.IHttpHandler  0 instance 0000000000000000 _currentHandler
000007feefa76c50  400101b      127       System.Boolean  1 instance                1 _ProfileDelayLoad
000007fee762ac68  400101c       f8 ...ielessHelperClass  0 instance 000000010fa0ae38 _CookielessHelper
000007feefa7a540  4000fe5      1a0 ...flection.Assembly  0   shared           static SystemWebAssembly
                                 >> Domain:Value  00000000019cba00:NotInit  000000001c71bae0:NotInit  000000001116c850:NotInit  000000001116d160:NotInit  000000000c4ec610:NotInit  <<
Next is the _request (address 000000010fa0aba8):
0:061> !do 000000010fa0aba8
Name: System.Web.HttpRequest
MethodTable: 000007fee7625460
EEClass: 000007fee7283cb0
Size: 336(0x150) bytes
GC Generation: 2
              MT            Field           Offset                 Type VT             Attr            Value Name
000007fee762a168  400107a        8 ...HttpWorkerRequest  0 instance 000000010fa0a870 _wr
000007fee7622488  400107b       10 ...m.Web.HttpContext  0 instance 000000010fa0aa58 _context
000007feefa77a80  400107c       18        System.String  0 instance 000000010fa0af78 _httpMethod
000007fee75de9a8  400107d      130         System.Int32  1 instance                2 _httpVerb
000007feefa77a80  400107e       20        System.String  0 instance 0000000000000000 _requestType
000007fee761c5c0  400107f       28 ...m.Web.VirtualPath  0 instance 000000011f861790 _path
000007feefa77a80  4001080       30        System.String  0 instance 000000010fa0a960 _rewrittenUrl
000007feefa76c50  4001081      138       System.Boolean  1 instance                1 _computePathInfo
000007fee761c5c0  4001082       38 ...m.Web.VirtualPath  0 instance 000000011f861790 _filePath
000007fee761c5c0  4001083       40 ...m.Web.VirtualPath  0 instance 0000000000000000 _currentExecutionFilePath
000007fee761c5c0  4001084       48 ...m.Web.VirtualPath  0 instance 0000000000000000 _pathInfo
000007feefa77a80  4001085       50        System.String  0 instance 00000001cf560370 _queryStringText
000007feefa76c50  4001086      139       System.Boolean  1 instance                1 _queryStringOverriden
000007feefa7fac0  4001087       58        System.Byte[]  0 instance 0000000000000000 _queryStringBytes
000007feefa77a80  4001088       60        System.String  0 instance 0000000000000000 _pathTranslated
000007feefa77a80  4001089       68        System.String  0 instance 00000001cf560370 _contentType
000007feefa7ecf0  400108a      134         System.Int32  1 instance               -1 _contentLength
000007feefa77a80  400108b       70        System.String  0 instance 0000000000000000 _clientTarget
000007feefa65870  400108c       78      System.Object[]  0 instance 0000000000000000 _acceptTypes
000007feefa65870  400108d       80      System.Object[]  0 instance 0000000000000000 _userLanguages
000007fee7591688  400108e       88 ...owserCapabilities  0 instance 000000019343ea50 _browsercaps
000007feeef18010  400108f       90           System.Uri  0 instance 000000011f868f78 _url
000007feeef18010  4001090       98           System.Uri  0 instance 000000011f8691b0 _referrer
000007fee7633598  4001091       a0 ...b.HttpInputStream  0 instance 0000000000000000 _inputStream
000007fee75e2db8  4001092       a8 ...ClientCertificate  0 instance 0000000000000000 _clientCertificate
000007feefa66e58  4001093       b0 ...l.WindowsIdentity  0 instance 0000000000000000 _logonUserIdentity
000007fee7591070  4001094       b8 ...tpValueCollection  0 instance 0000000000000000 _params
000007fee7591070  4001095       c0 ...tpValueCollection  0 instance 000000011f868248 _queryString
000007fee7591070  4001096       c8 ...tpValueCollection  0 instance 000000011f867418 _form
000007fee76325e0  4001097       d0 ...pHeaderCollection  0 instance 000000011f8693f8 _headers
000007fee7591200  4001098       d8 ...verVarsCollection  0 instance 0000000000000000 _serverVariables
000007fee762fa90  4001099       e0 ...pCookieCollection  0 instance 000000011f861808 _cookies
000007fee75c5068  400109a       e8 ...ttpFileCollection  0 instance 000000011f868500 _files
000007fee7632b78  400109b       f0 ...awUploadedContent  0 instance 0000000000000000 _rawContent
000007feefa76c50  400109c      13a       System.Boolean  1 instance                0 _readEntityBody
000007feefa65870  400109d       f8      System.Object[]  0 instance 0000000000000000 _multipartContentElements
000007feefa7f8d0  400109e      100 System.Text.Encoding  0 instance 0000000000000000 _encoding
000007fee75ba038  400109f      108 ...treamFilterSource  0 instance 0000000000000000 _filterSource
000007feefa7aa30  40010a0      110     System.IO.Stream  0 instance 0000000000000000 _installedFilter
000007fee7590a70  40010a1      140 ...SimpleBitVector32  1 instance 000000010fa0ace8 _flags
000007feefa77a80  40010a4      118        System.String  0 instance 0000000000000000 _AnonymousId
000007fee761c5c0  40010a5      120 ...m.Web.VirtualPath  0 instance 000000011f861790 _clientFilePath
000007fee761c5c0  40010a6      128 ...m.Web.VirtualPath  0 instance 0000000000000000 _clientBaseDir
000007feefa77370  40010a2      1b0        System.Object  0   shared           static s_browserLock
                                 >> Domain:Value  00000000019cba00:NotInit  000000001c71bae0:00000001a0b8cf00 000000001116c850:00000000ff583bb8 000000001116d160:000000015f6666b0 000000000c4ec610:00000001bf8562c8 <<
000007feefa76c50  40010a3      808       System.Boolean  1   shared           static s_browserCapsEvaled
                                 >> Domain:Value  00000000019cba00:NotInit  000000001c71bae0:1 000000001116c850:1 000000001116d160:1 000000000c4ec610:1 <<
And finally the _rewrittenUrl (address 000000010fa0a960):
0:061> !do 000000010fa0a960
Name: System.String
MethodTable: 000007feefa77a80
EEClass: 000007feef67e530
Size: 112(0x70) bytes
GC Generation: 2
String: /Rendering/id/41963766/as/html40/site/msnbc
              MT            Field           Offset                 Type VT             Attr            Value Name
000007feefa7ecf0  4000096        8         System.Int32  1 instance               44 m_arrayLength
000007feefa7ecf0  4000097        c         System.Int32  1 instance               43 m_stringLength
000007feefa794c8  4000098       10          System.Char  1 instance               2f m_firstChar
000007feefa77a80  4000099       20        System.String  0   shared           static Empty
                                 >> Domain:Value  00000000019cba00:00000001cf560370 000000001c71bae0:00000001cf560370 000000001116c850:00000001cf560370 000000001116d160:00000001cf560370 000000000c4ec610:00000001cf560370 <<
000007feefa79378  400009a       28        System.Char[]  0   shared           static WhitespaceChars
                                 >> Domain:Value  00000000019cba00:00000001cf560ac0 000000001c71bae0:00000001bfb9ca78 000000001116c850:000000019f587be0 000000001116d160:000000019fc42db0 000000000c4ec610:00000001ef5fb9e0 <<
Now I can see the url that was being processed by this thread (in the output labeled “String”). I ran the same commands on the other interesting threads and noticed that all of them were processing the same request url. Using the id in the url, we were able to identify a problem with the data for that document. We fixed the data and changed the code to remove the possibility of the problem arising in the future.
WinDbg is a really powerful tool. It can look intimidating at first, but once you get familiar with just a handful of commands, you can effectively debug 80% of the issues you’d come across.
More resources:
Tess Ferrandez’s Blog
Easy WinDbg

No comments:

Post a Comment