Hi,
Our content authors and testers started reporting that the "Page Unresponsive" error dialog was appearing more and more often while they worked on the CM server. They also reported frequent long waits during edit and save operations.
I started investigating and went through the Sitecore logs to find the root cause. These were my observations:
- Created pages in Experience Editor and did not encounter any issues during normal editing actions.
- Created a copy of a medium-sized folder in the site (e.g. /sitecore/content/<SiteName>/<Test Folder Path>) and recycled it later. The Page Unresponsive error was reproducible on both actions, and all subsequent requests from the Sitecore client kept waiting. This confirmed that a large copy/delete/publish operation consumes most of the resources available for SQL operations, so other users get blocked.
- Analyzed the logs for the last few weeks and found that the following activities happened around the same timestamp whenever a performance issue on CM was reported:
  - Item publish
  - Index rebuilds/updates
  - User session logouts
  - Long running operations in Sitecore, e.g. log entries like:
    - DEBUG Long running operation: Running Validation Rules
    - DEBUG Long running operation: GetContentEditorWarningsArgs pipeline
    - DEBUG Long running operation: renderContentEditor pipeline[id={11111111-1111-1111-1111-111111111111}]
    - DEBUG Long running operation: getChromeData pipeline
  - Unnecessary jobs running in the background, e.g. IndexingStateSwitcher running every minute
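To see which long running operations dominate, the log entries can be tallied with a small script. This is only a rough sketch of how I would do it, not a Sitecore tool. It assumes each entry contains the text "Long running operation:" followed by the operation name, as in the excerpts above. The function name `count_long_running` and the thread/time prefix in the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Marker text taken from the log excerpts above.
MARKER = "Long running operation:"


def count_long_running(lines):
    """Count how often each long running operation name appears in the given lines."""
    counts = Counter()
    for line in lines:
        _, found, rest = line.partition(MARKER)
        if not found:
            continue  # not a long running operation entry
        # Drop a trailing "[id={...}]" suffix so entries for the same pipeline group together.
        name = re.sub(r"\[.*\]\s*$", "", rest).strip()
        if name:
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines; the prefix before DEBUG/INFO is an assumed format.
    sample = [
        "1234 10:15:01 DEBUG Long running operation: Running Validation Rules",
        "1234 10:15:04 DEBUG Long running operation: getChromeData pipeline",
        "5678 10:15:09 DEBUG Long running operation: Running Validation Rules",
        "5678 10:15:12 INFO  Some other log entry",
    ]
    for name, n in count_long_running(sample).most_common():
        print(f"{n:4d}  {name}")
```

For a real log, pass an open file instead of the sample list, e.g. `count_long_running(open(path, encoding="utf-8"))`, where `path` points at one of your Sitecore log files.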
Potential causes and fixes:
- Many of our custom indexes on the master database used the syncMaster strategy for index rebuilds/updates. syncMaster re-indexes updated data immediately after various events in CM. This gives almost real-time indexing, but it is the most expensive indexing strategy in terms of machine resources and should only be used in limited circumstances, because any large or frequent item operation in CM then hurts CM performance. We should restrict syncMaster to the few indexes that really need it and use a strategy like intervalAsync for the others.
- Long running operations reported in the Sitecore logs need analysis. One of the most frequent is "Running Validation Rules".
- Sitecore jobs running in the background need tuning, and unnecessary jobs can be stopped or disabled.
- It appears that during actions like publishing, index updates or bulk item processing, CM processes keep waiting for SQL operations to complete while CPU utilization shoots above 85%, which causes the Page Unresponsive issue. We may have to increase SQL resources.