soumyasch Posted December 26, 2009
For the last few days I have been seeing random CPU spikes. About once every 1.5 to 2 minutes, the "System" process takes up one entire core for about 10 seconds. Because it is the System process, I suspect a driver. My most recent driver upgrades were the Nvidia 195.62 laptop drivers (WHQL) and the Realtek R239 HD Audio drivers, but I cannot blame either upgrade because I didn't notice exactly when the spiking started. I have ruled out other possible factors, including malware and rootkits.

Process Explorer narrows it down to a thread that starts at Ntkrnlpa.exe!KeInsertQueueDpc+0x275, but because System is a protected process, Process Explorer can't access any more information, including the thread stack. Several other threads start at the same address but do not spike.

How can I find out exactly what is causing the spikes, what function starts at KeInsertQueueDpc+0x275, or what execution stack led to the spikes? Any help in getting to the root of the problem is greatly appreciated.
MagicAndre1981 Posted December 26, 2009
Hi, follow my tutorial here: http://www.msfn.org/board/get-cause-high-c...pt-t140263.html
soumyasch Posted December 26, 2009
Thanks for your suggestion. I have already used Kernrate. (Using symbols with Process Explorer wasn't helpful: all it showed was that the thread came from the thread pool. It couldn't show anything else because the System process is protected in Win7.)

I have two installations of Win7 on the same system (one Pro, the other Ultimate). The spiking occurs only in Pro. I ran the same workload (the same running processes plus uTorrent and FDM, with the same config, downloading the same file) in both environments for one hour with Kernrate running. Comparing the results showed that Ntfs.sys generated about 10% of the events in Pro but ~0% in Ultimate. I have no idea what's triggering this behavior in Ntfs.sys in Pro. Re-running the profiler without the downloaders gives the same result. The other modules account for more or less similar percentages of events in both installations.

There isn't any disk thrashing during the CPU spikes (if it is of interest, regular filesystem tasks barely use any CPU). So it looks like the NTFS driver is repeatedly trying to do something but getting stuck in a loop without doing anything noticeable. I will try profiling again with xperf and let you know the results.
soumyasch Posted December 26, 2009
Poked around with xperf. It doesn't look like a DPC issue: processor usage due to DPCs barely touches 3% at most. Process Explorer confirms this; its DPCs counter barely moves.
soumyasch Posted December 26, 2009
But I saw something else. Coinciding with the CPU spikes, interrupts also go up and file activity occurs. In the graph, the green line is the CPU usage for interrupts, the red line is the CPU utilization of the first core, and the blue line is the CPU utilization of the second core. The bars indicate file activity. All of these rise together at the CPU spikes.

Looking into the CPU usage around the time of the spikes, sure enough the System process is spiking, and apart from the kernel itself, the Ntfs.sys driver shows the most usage. The values are similar to those traced by Kernrate.
soumyasch Posted December 26, 2009
Looking into the details of the file activity, there are three events the System process participated in. The huge towers in the file I/O graph come from the Create event. The total time the System process spent on Create events is close to five seconds, which is about as long as the CPU spikes last, and it generated about 250,000 I/O request packets, which explains the spikes in file I/O.
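To put those figures in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the two numbers quoted in the post (about 250,000 requests in about five seconds); the helper name `request_rate` is made up for illustration, and the per-request time assumes the requests are issued one after another.

```python
def request_rate(requests: int, seconds: float) -> float:
    """Average number of requests issued per second."""
    return requests / seconds


# Figures quoted in the post: ~250,000 I/O request packets in ~5 seconds.
rate = request_rate(250_000, 5.0)
# Average time per request in microseconds, assuming they run back to back.
per_request_us = 1_000_000 / rate

print(f"{rate:,.0f} requests/s, about {per_request_us:.0f} microseconds each")
```

At roughly 50,000 Create requests per second, a single thread doing nothing else would plausibly keep one core busy for the whole burst, which fits the observed spike.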
soumyasch Posted December 26, 2009
Each of those 260,000 events was created by the thread with ID 48 of the System process. Watching the System process's thread activity in Process Explorer during a spike confirms that this is the same thread that spikes. Every one of those events looks the same:

File Name: \Device\HarddiskVolume2\Windows\System32\drivers\etc\lmhosts
Flags: synchronous_io_nonalert Option24 normal shareRead shareWrite
Result: Object Name not found. (0xc0000034)

So it looks like it's trying to create (or read?) the lmhosts file and failing. Sure enough, the file isn't present in %windir%\System32\drivers\etc\; I do not use WINS. But why the hell is it trying the same thing more than 250,000 times when it has already failed once? And why is it doing this over and over again? I will try creating a dummy lmhosts file and see what happens, but that looks like a band-aid, not a solution.
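The kind of grouping that exposes a pattern like this can be sketched in a few lines of Python. This is only an illustration: the `FileEvent` record layout is an assumption (it is not xperf's export format), and the sample events are made up apart from the lmhosts path, thread ID, and result code quoted in the post.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class FileEvent(NamedTuple):
    """Hypothetical record for one file I/O event taken from a trace."""
    thread_id: int
    operation: str
    file_name: str
    result: str


def tally(events: Iterable[FileEvent]) -> Counter:
    """Count how often each identical (thread, operation, file, result) occurs."""
    return Counter(events)  # NamedTuples are hashable, so they work as keys


# Made-up sample: three identical failing Creates plus one unrelated event.
LMHOSTS = r"\Device\HarddiskVolume2\Windows\System32\drivers\etc\lmhosts"
sample = [
    FileEvent(48, "Create", LMHOSTS, "0xc0000034"),
    FileEvent(48, "Create", LMHOSTS, "0xc0000034"),
    FileEvent(52, "Read", r"\Device\HarddiskVolume2\example.log", "0x0"),
    FileEvent(48, "Create", LMHOSTS, "0xc0000034"),
]

top_event, count = tally(sample).most_common(1)[0]
print(count, top_event.thread_id, top_event.operation, top_event.file_name)
```

With a real trace, a single key dominating the tally by several orders of magnitude is the signal that one thread is repeating the same request in a loop.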
soumyasch Posted December 26, 2009
Okay, since I created a blank lmhosts file, I haven't had a spike. So, resolved with a band-aid, I think. I will keep an eye on it for the next few hours.
soumyasch Posted December 26, 2009
Looks like the celebration was premature. I rebooted, and the problem is back. This time, even though the lmhosts file exists and the read succeeds, 150,000 requests are still issued.
soumyasch Posted December 26, 2009
Deleting and re-creating lmhosts fixed the problem, but it resurfaced on the next restart. Dammit, I want a resolution. I don't ****ing care what the problem is anymore. I am taking a heavy hammer and disabling NetBIOS over TCP by brute force. No more NetBIOS, no more LanMan name resolution!
MagicAndre1981 Posted December 26, 2009
OK, because you now know the cause, contact MS support and tell them what you found. In the meantime, you can code a small program that creates the empty file and run it with Task Scheduler at every startup. That gives you a workaround until MS fixes it.
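A minimal sketch of such a program in Python, assuming Python is installed on the machine and that the script runs with enough rights to write under %windir%. Because an lmhosts file that merely existed did not prevent the spikes after a reboot, the sketch deletes the file first and then re-creates it. The function names are made up for illustration, and the fallback to C:\Windows is an assumption.

```python
import os
from pathlib import Path


def default_lmhosts_path() -> Path:
    """Build %windir%\\System32\\drivers\\etc\\lmhosts, the path seen in the trace."""
    # Assumption: fall back to C:\Windows if %windir% is not set.
    windir = os.environ.get("windir", r"C:\Windows")
    return Path(windir) / "System32" / "drivers" / "etc" / "lmhosts"


def recreate_empty_file(path: Path) -> None:
    """Delete the file if it exists, then create it again with no content."""
    try:
        path.unlink()
    except FileNotFoundError:
        pass  # nothing to delete; the file is created below
    path.touch()
```

Calling `recreate_empty_file(default_lmhosts_path())` from a script registered as an "at startup" task in Task Scheduler would reproduce the manual delete-and-re-create step. This is still only the band-aid described in the thread, not a fix for the underlying loop.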
FallenDeku Posted January 1, 2010
If anyone reading this topic has the same problem (I did), I've found a solution which appears to fix the problem for good:

1. Open the properties box for a network adapter; any will do.
2. Click TCP/IPv4 and then Properties.
3. Click Advanced and jump to the WINS tab.
4. De-select "Enable LMHOSTS lookup".

When you OK out of it, the setting is applied to all network adapters.

Hope this helps someone; I spent most of my afternoon trying to find out what was going on.
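To check afterwards whether the change took effect, a read-only sketch like the following may help. It assumes the checkbox is backed by the EnableLMHOSTS DWORD under the NetBT Parameters registry key, which is where this setting is commonly documented; treat that mapping as an assumption and verify it on your own system. The function names are made up for illustration.

```python
import sys

# Assumed registry location of the global LMHOSTS switch.
NETBT_PARAMS = r"SYSTEM\CurrentControlSet\Services\NetBT\Parameters"
VALUE_NAME = "EnableLMHOSTS"


def read_enable_lmhosts():
    """Return the stored DWORD, or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # the winreg module only exists on Windows
    import winreg

    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, NETBT_PARAMS) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except OSError:
        return None  # key or value missing, or access denied


def describe(value) -> str:
    """Turn the raw value into a readable status line."""
    if value is None:
        return "EnableLMHOSTS could not be read"
    return "LMHOSTS lookup is " + ("enabled" if value else "disabled")


if __name__ == "__main__":
    print(describe(read_enable_lmhosts()))
```

The sketch only reads the value. Changing the setting through the adapter properties dialog, as described in the steps above, remains the safer route.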
JoeyX Posted January 29, 2012
Replying to FallenDeku's fix (de-selecting "Enable LMHOSTS lookup" on the WINS tab): It works! Thanks.