Hi,
Our content authors and testers started reporting that the Page Unresponsive error dialog was appearing more and more often while they worked on the CM server. They also reported frequent long waits during edit and save operations.
I started investigating and went through the Sitecore logs to find the root cause. I had some interesting observations -
- Created pages in Experience Editor and did not
     encounter any issues during normal editing actions.
 - Created a copy of a medium-sized folder in the site (e.g. /sitecore/content/<SiteName>/<Test Folder Path>) and
     recycled it later. Both actions reproduced the Page Unresponsive error,
     and all subsequent requests from the Sitecore client kept
     waiting. This confirmed that a large copy/delete/publish operation consumes most
     of the resources available for SQL operations, so other users
     get blocked.
 - Analyzed the logs for the last few weeks and found that the
     following activities occurred around the same timestamps whenever a
     performance issue on CM was reported -
     - Item publish
     - Index rebuilds/updates
     - User session logouts
     - Long running operations in Sitecore, e.g. log entries like:
         - DEBUG Long running operation: Running Validation Rules
         - DEBUG Long running operation: GetContentEditorWarningsArgs pipeline
         - DEBUG Long running operation: renderContentEditor pipeline[id={11111111-1111-1111-1111-111111111111}]
         - DEBUG Long running operation: getChromeData pipeline
     - Unnecessary jobs running in the background, e.g.
         IndexingStateSwitcher running every 1 minute
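
To see which long running operations show up most often, the log lines can be tallied with a small script. This is only a rough sketch, not the tooling used for the analysis above: it assumes each entry contains the text "Long running operation:" followed by the operation name, as in the samples quoted above, and the function name count_long_running is made up for illustration.

```python
import re
from collections import Counter

# Marker text copied from the log samples quoted in this post.
MARKER = "Long running operation:"

# Assumption: an optional trailing "[id={GUID}]" suffix can be dropped so that
# entries for the same pipeline are grouped under one name.
ID_SUFFIX = re.compile(r"\[id=\{[0-9A-Fa-f-]+\}\]\s*$")


def count_long_running(lines):
    """Count long running operations by name across an iterable of log lines."""
    counts = Counter()
    for line in lines:
        # Everything after the marker is treated as the operation name.
        _, found, rest = line.partition(MARKER)
        if not found:
            continue  # not a long running operation entry
        name = ID_SUFFIX.sub("", rest.strip()).strip()
        if name:
            counts[name] += 1
    return counts
```

For example, opening one log file (the path is a placeholder) and calling count_long_running(handle).most_common(5) lists the five most frequent operations, which is a quick way to decide what to analyze first.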
 
Potential causes and fixes -
- Many of our custom indexes on the master database used the syncMaster strategy for index
     rebuilds/updates. The syncMaster strategy re-indexes updated data immediately after various events in CM.
     This gives almost real-time indexing, but syncMaster is the most
     expensive indexing strategy in terms of machine resources and should only
     be used in limited circumstances, because any large or frequent item operation in
     CM will impact CM performance. We should restrict syncMaster to a few
     indexes and use strategies such as intervalAsync for the other
     indexes.
 - Long running operations reported in the Sitecore logs need
     analysis. One of the most frequent long running operations is Running
     Validation Rules.
 - Sitecore jobs running in the background need tuning, and
     unnecessary jobs can be killed.
 - It appears that during actions like publishing, index updates or bulk item processing, processes keep waiting for SQL operations to complete and CPU utilization shoots above 85%, which causes the Page Unresponsive issue. We may have to increase SQL resources.
 