PK,While there are lots of resources on debugging high CPU in ASP.Net applications, I thought I’d list out the specific steps I took to debug this particular case.
We noticed one rendering server had 85% proc utilization all morning. We asked ops to take a memory dump before doing the IIS reset. Can you help us take a look at it?
Let’s dig in! Here’s how we figured out what was causing the issue:
Our ops team used DebugDiag to capture the dump. Since a full mini dump is all of the memory in the process space, the file is the same size as the private bytes in the process. In this case the dump was right around 5GB, so I copied it locally first :)
Load up the memory dump in WinDbg. We use 64-bit Windows 2008 R2 Servers, so we’ll use the 64-bit version of WinDbg. You can download WinDbg via the Debugging Tools for Windows package here. To open the dump, click File -> Open Crash Dump.
Load the psscor2 extension. You can get PSSCor2 here. PSSCor2 is like the SOS extension on steroids.
.load psscor2
First, I want to look at how much time threads are spending in user time. Without any additional parameters, the !runaway command will show how much time each thread has spent executing user mode code. You can also use other parameters to show kernel time and thread lifetime. In this case, I’m only interested in user time.0:061> !runaway
User Mode Time
Thread Time
61:17c8 0 days 5:30:25.387
63:174c 0 days 5:19:15.159
60:1b08 0 days 5:15:17.742
64:ad0 0 days 5:12:06.219
69:1e50 0 days 5:07:28.600
70:1dbc 0 days 5:06:42.720
67:fe8 0 days 5:05:01.179
65:18cc 0 days 5:03:09.576
62:1f14 0 days 5:02:49.701
66:6d4 0 days 4:59:53.997
73:a10 0 days 4:50:38.727
Turns out we have 37 threads that have significantly higher user mode time than the other process threads. That’s suspicious. Let’s see what those threads are spending so much time doing. First, we switch to one of the interesting threads, then look at the call stack. The leftmost number is the thread id, and we use the syntax ~ s to switch threads.0:000> ~61s
*** WARNING: Unable to verify checksum for mscorlib.ni.dll
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfa3f:
000007fe`ef94d65f 413bc5 cmp eax,r13d
OK, so we see here we’re finding an entry in a dictionary. Switching to some of the other interesting threads, we see similar info:0:061> ~63s
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfab0:
000007fe`ef94d6d0 488b5710 mov rdx,qword ptr [rdi+10h] ds:00000001`b2c3d030=0000000183fed728
0:063> ~60s
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfa2d:
000007fe`ef94d64d 483bc8 cmp rcx,rax
0:060> ~64s
mscorlib_ni!System.Collections.Generic.Dictionary`2[[System.IntPtr, mscorlib],[System.Int16, mscorlib]].FindEntry(IntPtr)+0xffffffff`ff9bfa26:
000007fe`ef94d646 4863cb movsxd rcx,ebx
That looks pretty suspicious. All four threads we’ve looked at are all in the same place in the call stack. Further validation of the call stack by using!clrstack
and looking at the additional threads confirm the suspicion. Something weird is going on: all of our threads are getting hung up in the same place. In a previous investigation, we found that we had a runaway iterator that caused us problems. Looking at the call stack and the source code, it doesn’t appear to be the same problem. The next step was to find out if the requests that triggered the issue were at all related. Let’s find out what url was being served. We’ll start back at the first thread. We use !clrstack -p to show the parameters that are available at each level in the stack. I’m looking for a method where I can get at the context objects. I’ve trimmed the output of the command here, but this is the method in the stack I focused in on:
000000000aa2e240 000007feec2a9efd System.Web.Mvc.ControllerActionInvoker.InvokeAction(System.Web.Mvc.ControllerContext, System.String)
PARAMETERS:
this = 0x000000011f866de0
controllerContext = 0x000000011f866bf0
actionName = [no data]
I have a controllerContext parameter that I can inspect and get to the original url. I use the address (0x000000011f866bf0) to get it’s contents. Now comes the rinse and repeat part.
First, we dump the controllerContext object using it’s address:
0:061> !do 0x000000011f866bf0
Name: System.Web.Mvc.ControllerContext
MethodTable: 000007feec312348
EEClass: 000007feec1c5290
Size: 48(0x30) bytes
GC Generation: 2
(C:\Windows\assembly\GAC_MSIL\System.Web.Mvc\2.0.0.0__31bf3856ad364e35\System.Web.Mvc.dll)
Fields:
MT Field Offset Type VT Attr Value Name
000007fefae0e558 4000003 8 ...b.HttpContextBase 0 instance 000000011f8611a8 _httpContext
000007fef9c77070 4000004 10 ...ng.RequestContext 0 instance 000000011f861468 _requestContext
000007fef9c771f0 4000005 18 ...Routing.RouteData 0 instance 000000011f861238 _routeData
000007feec3198b8 4000006 20 ...vc.ControllerBase 0 instance 000000011f866b68 [Controller] k__BackingField
Then dump the _httpContext (with it’s address of 000000011f8611a8):0:061> !do 000000011f8611a8
Name: System.Web.HttpContextWrapper
MethodTable: 000007fefae0e728
EEClass: 000007fefade2248
Size: 24(0x18) bytes
GC Generation: 2
(C:\Windows\assembly\GAC_MSIL\System.Web.Abstractions\3.5.0.0__31bf3856ad364e35\System.Web.Abstractions.dll)
Fields:
MT Field Offset Type VT Attr Value Name
000007fee7622488 4000004 8 ...m.Web.HttpContext 0 instance 000000010fa0aa58 _context
Then dump the _context (address of 000000010fa0aa58):0:061> !do 000000010fa0aa58
Name: System.Web.HttpContext
MethodTable: 000007fee7622488
EEClass: 000007fee72823b8
Size: 336(0x150) bytes
GC Generation: 2
(C:\Windows\assembly\GAC_64\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
MT Field Offset Type VT Attr Value Name
000007fee7624fb0 4000fe6 8 ...IHttpAsyncHandler 0 instance 0000000000000000 _asyncAppHandler
000007fee7624ae0 4000fe7 10 ...b.HttpApplication 0 instance 0000000190ad1620 _appInstance
000007fee7625020 4000fe8 18 ....Web.IHttpHandler 0 instance 000000011f861488 _handler
000007fee7625460 4000fe9 20 ...m.Web.HttpRequest 0 instance 000000010fa0aba8 _request
000007fee76257e0 4000fea 28 ....Web.HttpResponse 0 instance 000000010fa0acf8 _response
000007fee7625cf8 4000feb 30 ...HttpServerUtility 0 instance 0000000000000000 _server
000007feefa71020 4000fec 38 ...Collections.Stack 0 instance 0000000000000000 _traceContextStack
000007fee762b060 4000fed 40 ....Web.TraceContext 0 instance 0000000000000000 _topTraceContext
000007feefa7f3d8 4000fee 48 ...ections.Hashtable 0 instance 000000011f8614a8 _items
000007feefa7e968 4000fef 50 ...ections.ArrayList 0 instance 0000000000000000 _errors
000007feefa77d48 4000ff0 58 System.Exception 0 instance 0000000000000000 _tempError
000007feefa76c50 4000ff1 118 System.Boolean 1 instance 0 _errorCleared
000007feefa929d8 4000ff2 60 ...ncipal.IPrincipal 0 instance 00000001a06e2608 _user
000007feefa7a488 4000ff3 100 System.IntPtr 1 instance 0 _pManagedPrincipal
000007fee7624758 4000ff4 68 ...ofile.ProfileBase 0 instance 0000000000000000 _Profile
000007feefab7fb8 4000ff5 128 System.DateTime 1 instance 000000010fa0ab80 _utcTimestamp
000007fee762a168 4000ff6 70 ...HttpWorkerRequest 0 instance 000000010fa0a870 _wr
000007feefa73850 4000ff7 130 ...Services.GCHandle 1 instance 000000010fa0ab88 _root
000007feefa7a488 4000ff8 108 System.IntPtr 1 instance 1877089392 _ctxPtr
000007fee761c5c0 4000ff9 78 ...m.Web.VirtualPath 0 instance 0000000000000000 _configurationPath
000007feefa76c50 4000ffa 119 System.Boolean 1 instance 0 _skipAuthorization
000007feefa80400 4000ffb 80 ...ation.CultureInfo 0 instance 000000019fc42e18 _dynamicCulture
000007feefa80400 4000ffc 88 ...ation.CultureInfo 0 instance 000000019fc42e18 _dynamicUICulture
000007feefa7ecf0 4000ffd 110 System.Int32 1 instance 0 _serverExecuteDepth
000007feefa71020 4000ffe 90 ...Collections.Stack 0 instance 0000000000000000 _handlerStack
000007feefa76c50 4000fff 11a System.Boolean 1 instance 0 _preventPostback
000007feefa76c50 4001000 11b System.Boolean 1 instance 0 _runtimeErrorReported
000007feefa76c50 4001001 11c System.Boolean 1 instance 1 _firstNotificationInitCalled
000007feefab7fb8 4001002 138 System.DateTime 1 instance 000000010fa0ab90 _timeoutStartTime
000007feefa76c50 4001003 11d System.Boolean 1 instance 1 _timeoutSet
000007feefab7eb8 4001004 140 System.TimeSpan 1 instance 000000010fa0ab98 _timeout
000007feefa7ecf0 4001005 114 System.Int32 1 instance 0 _timeoutState
000007fee7632540 4001006 98 ...b.Util.DoubleLink 0 instance 000000011f861168 _timeoutLink
000007feefa78520 4001007 a0 ....Threading.Thread 0 instance 00000001900da878 _thread
000007fee761e9e0 4001008 a8 ...eb.CachedPathData 0 instance 0000000000000000 _configurationPathData
000007fee761e9e0 4001009 b0 ...eb.CachedPathData 0 instance 000000011f866498 _filePathData
000007feefa77a80 400100a b8 System.String 0 instance 0000000000000000 _sqlDependencyCookie
000007fee762e7b0 400100b c0 ...essionStateModule 0 instance 0000000000000000 _sessionStateModule
000007fee7595230 400100c c8 ...I.TemplateControl 0 instance 0000000000000000 _templateControl
000007fee75f6028 400100d d0 ...tificationContext 0 instance 000000011f866aa0 _notificationContext
000007feefa76c50 400100e 11e System.Boolean 1 instance 1 _isAppInitialized
000007feefa76c50 400100f 11f System.Boolean 1 instance 1 _isIntegratedPipeline
000007feefa76c50 4001010 120 System.Boolean 1 instance 0 _finishPipelineRequestCalled
000007feefa76c50 4001011 121 System.Boolean 1 instance 0 _impersonationEnabled
000007feefa76c50 4001012 122 System.Boolean 1 instance 0 HideRequestResponse
000007feefa76c50 4001013 123 System.Boolean 1 instance 1 InIndicateCompletion
000007fee76262d8 4001014 d8 ...ion+ThreadContext 0 instance 000000011f861070 IndicateCompletionContext
000007fee7632410 4001015 e0 ...ronizationContext 0 instance 000000011f861038 _syncContext
000007feefa76c50 4001016 124 System.Boolean 1 instance 1 RequiresSessionState
000007feefa76c50 4001017 125 System.Boolean 1 instance 0 ReadOnlySessionState
000007feefa76c50 4001018 126 System.Boolean 1 instance 0 InAspCompatMode
000007fee7625020 4001019 e8 ....Web.IHttpHandler 0 instance 0000000000000000 _remapHandler
000007fee7625020 400101a f0 ....Web.IHttpHandler 0 instance 0000000000000000 _currentHandler
000007feefa76c50 400101b 127 System.Boolean 1 instance 1 _ProfileDelayLoad
000007fee762ac68 400101c f8 ...ielessHelperClass 0 instance 000000010fa0ae38 _CookielessHelper
000007feefa7a540 4000fe5 1a0 ...flection.Assembly 0 shared static SystemWebAssembly
>> Domain:Value 00000000019cba00:NotInit 000000001c71bae0:NotInit 000000001116c850:NotInit 000000001116d160:NotInit 000000000c4ec610:NotInit <<
Next is the _request (address 000000010fa0aba8):0:061> !do 000000010fa0aba8
Name: System.Web.HttpRequest
MethodTable: 000007fee7625460
EEClass: 000007fee7283cb0
Size: 336(0x150) bytes
GC Generation: 2
(C:\Windows\assembly\GAC_64\System.Web\2.0.0.0__b03f5f7f11d50a3a\System.Web.dll)
Fields:
MT Field Offset Type VT Attr Value Name
000007fee762a168 400107a 8 ...HttpWorkerRequest 0 instance 000000010fa0a870 _wr
000007fee7622488 400107b 10 ...m.Web.HttpContext 0 instance 000000010fa0aa58 _context
000007feefa77a80 400107c 18 System.String 0 instance 000000010fa0af78 _httpMethod
000007fee75de9a8 400107d 130 System.Int32 1 instance 2 _httpVerb
000007feefa77a80 400107e 20 System.String 0 instance 0000000000000000 _requestType
000007fee761c5c0 400107f 28 ...m.Web.VirtualPath 0 instance 000000011f861790 _path
000007feefa77a80 4001080 30 System.String 0 instance 000000010fa0a960 _rewrittenUrl
000007feefa76c50 4001081 138 System.Boolean 1 instance 1 _computePathInfo
000007fee761c5c0 4001082 38 ...m.Web.VirtualPath 0 instance 000000011f861790 _filePath
000007fee761c5c0 4001083 40 ...m.Web.VirtualPath 0 instance 0000000000000000 _currentExecutionFilePath
000007fee761c5c0 4001084 48 ...m.Web.VirtualPath 0 instance 0000000000000000 _pathInfo
000007feefa77a80 4001085 50 System.String 0 instance 00000001cf560370 _queryStringText
000007feefa76c50 4001086 139 System.Boolean 1 instance 1 _queryStringOverriden
000007feefa7fac0 4001087 58 System.Byte[] 0 instance 0000000000000000 _queryStringBytes
000007feefa77a80 4001088 60 System.String 0 instance 0000000000000000 _pathTranslated
000007feefa77a80 4001089 68 System.String 0 instance 00000001cf560370 _contentType
000007feefa7ecf0 400108a 134 System.Int32 1 instance -1 _contentLength
000007feefa77a80 400108b 70 System.String 0 instance 0000000000000000 _clientTarget
000007feefa65870 400108c 78 System.Object[] 0 instance 0000000000000000 _acceptTypes
000007feefa65870 400108d 80 System.Object[] 0 instance 0000000000000000 _userLanguages
000007fee7591688 400108e 88 ...owserCapabilities 0 instance 000000019343ea50 _browsercaps
000007feeef18010 400108f 90 System.Uri 0 instance 000000011f868f78 _url
000007feeef18010 4001090 98 System.Uri 0 instance 000000011f8691b0 _referrer
000007fee7633598 4001091 a0 ...b.HttpInputStream 0 instance 0000000000000000 _inputStream
000007fee75e2db8 4001092 a8 ...ClientCertificate 0 instance 0000000000000000 _clientCertificate
000007feefa66e58 4001093 b0 ...l.WindowsIdentity 0 instance 0000000000000000 _logonUserIdentity
000007fee7591070 4001094 b8 ...tpValueCollection 0 instance 0000000000000000 _params
000007fee7591070 4001095 c0 ...tpValueCollection 0 instance 000000011f868248 _queryString
000007fee7591070 4001096 c8 ...tpValueCollection 0 instance 000000011f867418 _form
000007fee76325e0 4001097 d0 ...pHeaderCollection 0 instance 000000011f8693f8 _headers
000007fee7591200 4001098 d8 ...verVarsCollection 0 instance 0000000000000000 _serverVariables
000007fee762fa90 4001099 e0 ...pCookieCollection 0 instance 000000011f861808 _cookies
000007fee75c5068 400109a e8 ...ttpFileCollection 0 instance 000000011f868500 _files
000007fee7632b78 400109b f0 ...awUploadedContent 0 instance 0000000000000000 _rawContent
000007feefa76c50 400109c 13a System.Boolean 1 instance 0 _readEntityBody
000007feefa65870 400109d f8 System.Object[] 0 instance 0000000000000000 _multipartContentElements
000007feefa7f8d0 400109e 100 System.Text.Encoding 0 instance 0000000000000000 _encoding
000007fee75ba038 400109f 108 ...treamFilterSource 0 instance 0000000000000000 _filterSource
000007feefa7aa30 40010a0 110 System.IO.Stream 0 instance 0000000000000000 _installedFilter
000007fee7590a70 40010a1 140 ...SimpleBitVector32 1 instance 000000010fa0ace8 _flags
000007feefa77a80 40010a4 118 System.String 0 instance 0000000000000000 _AnonymousId
000007fee761c5c0 40010a5 120 ...m.Web.VirtualPath 0 instance 000000011f861790 _clientFilePath
000007fee761c5c0 40010a6 128 ...m.Web.VirtualPath 0 instance 0000000000000000 _clientBaseDir
000007feefa77370 40010a2 1b0 System.Object 0 shared static s_browserLock
>> Domain:Value 00000000019cba00:NotInit 000000001c71bae0:00000001a0b8cf00 000000001116c850:00000000ff583bb8 000000001116d160:000000015f6666b0 000000000c4ec610:00000001bf8562c8 <<
000007feefa76c50 40010a3 808 System.Boolean 1 shared static s_browserCapsEvaled
>> Domain:Value 00000000019cba00:NotInit 000000001c71bae0:1 000000001116c850:1 000000001116d160:1 000000000c4ec610:1 <<
And finally the _rewrittenUrl (address 000000010fa0a960):0:061> !do 000000010fa0a960
Name: System.String
MethodTable: 000007feefa77a80
EEClass: 000007feef67e530
Size: 112(0x70) bytes
GC Generation: 2
(C:\Windows\assembly\GAC_64\mscorlib\2.0.0.0__b77a5c561934e089\mscorlib.dll)
String: /Rendering/id/41963766/as/html40/site/msnbc
Fields:
MT Field Offset Type VT Attr Value Name
000007feefa7ecf0 4000096 8 System.Int32 1 instance 44 m_arrayLength
000007feefa7ecf0 4000097 c System.Int32 1 instance 43 m_stringLength
000007feefa794c8 4000098 10 System.Char 1 instance 2f m_firstChar
000007feefa77a80 4000099 20 System.String 0 shared static Empty
>> Domain:Value 00000000019cba00:00000001cf560370 000000001c71bae0:00000001cf560370 000000001116c850:00000001cf560370 000000001116d160:00000001cf560370 000000000c4ec610:00000001cf560370 <<
000007feefa79378 400009a 28 System.Char[] 0 shared static WhitespaceChars
>> Domain:Value 00000000019cba00:00000001cf560ac0 000000001c71bae0:00000001bfb9ca78 000000001116c850:000000019f587be0 000000001116d160:000000019fc42db0 000000000c4ec610:00000001ef5fb9e0 <<
Now I can see the url that was being processed by this thread (in the output labeled “String”). I ran the same commands on the other interesting threads and noticed that all of them were processing the same request url. Using the id in the url, we were able to identify a problem with the data for that document. We fixed the data and changed the code to remove the possibility of the problem arising in the future.WinDbg is a really powerful tool. It can look intimidating at first, but once you get familiar with just a handful of commands, you can effectively debug 80% of the issues you’d come across.
More resources:
Tess Ferrandez’s Blog
WinDbg.info
Easy WinDbg
No comments:
Post a Comment