We originally thought the problem, which appeared to be a memory leak, was due to a search component we were using. Under load the application would run the CPU at 100% and eat 2-3 MB of memory per second. This would continue until memory use reached about 700 MB, at which point the application would attempt to recycle and start the whole cycle over again.
We’re now fairly confident we’ve found the bug, and it’s one that is only obvious under significant load. Simulating the type and pattern of load on a site as highly trafficked as www.asp.net is not as simple as it sounds. A few people questioned whether we even tested the site before we put it into production, which of course we had.
The bug was in a couple of lines of code in a custom component on www.asp.net. On a positive note, this was not Community Server code or Lucene (search code), but custom code written specifically for the CMS that drives www.asp.net. I won’t go into the specifics, but the bug had to do with loading and parsing an XML document.
We’re very sorry about the downtime this caused, and we want to keep everyone up to date with what we’ve found.