January 30, 2012

Thoughts on Upcoming 1.4 Skyrim Patch

One theory I have for the Skyrim crashes that I and others are experiencing: Intel Speedstep and/or Core Parking powers down some cores prematurely, causing the Skyrim code to try and access a thread running on an affected core.  The core takes some time to power up, which the Skyrim code doesn't account for, which in turn causes the entire system to crash.  It's a theory based on observations, not necessarily in-depth knowledge of either the Skyrim code or how the technology works.  The observation is disabling Intel Speedstep and/or core parking (in other words, causing the cores to run at full speed all the time) increases stability (read: no crashing).  Of course there is the "not everyone is crashing" variable which may be due to the crashing systems being very fast (not everyone has the latest hardware), others having disabled core parking in the past (this affects older games as well) or having enough background processes to prevent the cores from powering down in the first place.  This is conjecture at this point, of course, but that's my theory in a nutshell.
The upcoming 1.4 patch should prove interesting.  One thing they supposedly fixed is optimizing the engine to use the hardware more efficiently.  This alone may fix the crashing problem.  There's definitely going to be a series of tests that I need to run after the patch comes out.

January 20, 2012

New Fangled Tech

Naturally, I'm researching Core Parking.  It's been around for a number of years, but during those same years, I've essentially "dropped out" of keeping up with technology.  My previous rig was a Dell with a dual-core (I forget which) and it worked fine for minor gaming.  It kept on working with newer games, so I never dug into the "new-fangled" stuff.  That system lasted me over five years.  Also, since the prices for components and custom machines have dropped significantly, actually building my own rig (or building for others) had become unreasonable: it's cheaper and easier to just order something that has what you need.  "Back in my day" you needed to build your own system if you wanted cutting edge technology without paying an arm and a leg (*ahem*Alienware*ahem*).  But with a new "custom built" rig that someone else (CyberpowerPC) built for me, I'm back waist-deep in researching the new-fangled technologies.

Not much has changed.  Well, a lot has changed tech wise, but the community seems to have stayed the same: just a few "serious" sites (Anandtech, Tom's Hardware, Overclockers.net), riddled with different groups of increasing population:

Very Few: The hard core geeks who can differentiate between different gate technology materials (I just made that up), drill holes into their CPUs and even solder on wires to get that last 0.5 frames per second.  These guys will tear you a new one (by answering in detail or by dismissively giving a correct but unexpected answer) when you ask which processor is faster.

Minority: Somewhat serious folks who like to tinker, but are not drilling holes in CPUs (although when pushed, they will and then go into a depression).  They are up to speed with the latest marketing and specs and can generally work through tough issues, but also know when to stop and accept a non-optimal solution.

Majority: Those who just want things to work and wander into various communities looking for help.  They eagerly want to solve their problem, but sometimes throw everyone off by asking how to edit the registry.

Of course, there are the usual nut-jobs as well and the angry ones who will rant and rave for pages and pages about how much they don't care about something.  These people tend to campaign for boycotts and demand something unreasonable, like a game company to call in programmers to fix their specific issue.

I like to think I fall in the minority because I know enough, but not enough

As I started with, I've been reading about Core Parking quite a bit and strongly suspect that this is the problem I (and others) might be having with Skyrim rebooting at random.  Consider this:

Note: my system has 2600k, Gigabyte MB, Gigabyte GTX 570, 16GB RAM

Known
  • Intermittent (sometimes after a minute, sometimes after an hour) hardware reset/crash/freeze.
  • Only with Skyrim
  • Works with other games
  • Works with Benchmarking tools (DX9, DX10, DX11) (over 7 hours running)
  • Works with stress testing tools (over 7 hours running)
  • Works with Benchmarking + Stress testing AT THE SAME TIME (Kombustor DX9 and DX10, 8X MSAA, 1920x1080, POST-FX, FULLSCREEN for about an hour)
  • Replaced Power Supply
  • Reinstalled Skyrim
  • Did the other things recommended by others on the 'net.
  • None of the above made a difference
  • No overheating
  • No error logs are generated - hardware just gets zapped
  • Unable to replicate rebooting/freezing outside of Skyrim
  • Disabling Speedstep made the system stable (one reboot in a week of playing for hours and hours a day).
  • Re-enabling Speedstep caused the crashes again.
  • Re-enabling Speedstep and disabling Core Parking seems stable (not fully tested yet).
  • Doing the above two seems to have worked for others (unsubstantiated reports only).
  • CPU/GPU load when playing Skyrim is very low on my system (settings are Ultra/maxed out).
  • Disabling Core Parking seems to be universally good for gaming (haven't found a negative post about it yet)
CPU Loads when running Skyrim

Unknown
  • Skyrim Creation Engine: is it built on Gamebryo? Why does it not use more cores?
  • Is Core Parking kicking in while playing Skyrim since the load is low?
  • Is the Skyrim engine somehow allergic to cores being powered down?
    • Skyrim works with Speedstep/C1E/C3/C6 State Support turned off (this prevents Core Parking)
    • Skyrim (seems to) work with Core Parking turned off and the above turned on
  • Out of the millions of Skyrim players, only a minority are having issues.  Is this because:
    • Only a few have high end processors + Windows 7 that support Core Parking?
    • Only a few have 16GB of RAM?
    • Only a few have low CPU/GPU use with Skyrim (most posts indicate that Skyrim is CPU intensive)?  This would mean my system is TOO FAST
    • Most have disabled Core Parking at some point in the past (I have a brand new box, nothing else loaded but games)?
    • Most have enough background processes to prevent Core Parking when playing Skyrim?
Anyway, some Core Parking things I found:
  • Works best in a server environment where there are cores that are truly not busy for long period of time
  • Does not save power if the system is under heavy load
  • Only in Windows 7, 8 and 2003, 2008
  • Impedes performance when enabled
  • Known to cause problems (Microsoft has a patch to disable it)

Testing to continue...

January 19, 2012

Skyrim Crash/Rebooting Fix: Disable Core Parking?

If you've read this blog before, you'll know that I've been having some issues with Skyrim on my new system.  Namely the problem was that Skyrim would cause the system to reboot intermittently.  It was very frustrating.  I had tried many, many things and the one that worked was disabling Speedstep.

However, this "solution" bugged me: it was not ideal and didn't explain the millions of other systems out there with Speedstep enabled that weren't crashing.  In other words, it was not elegant or clean.  Of course, there are enough Internet posts that talk about the crashing, but certainly it is not as widespread as it should be if the problem was Speedstep.  Another problem was that my system was using power like it was still the 1990s!  This certainly goes against my green tendencies.  And then two days ago, I got yet another reboot in the middle of Skyrim.

So, I kept on searching when I stumbled across a series of posts and articles related to "Core Parking."  Having either never heard of this or assuming it's the same as Speedstep, I was surprised to learn that this was something that was found only in the latest Intel processors (like mine) and only in the latest Windows OS (like mine).  I read through many, many articles and it seemed this was indeed a likely culprit.

My theory goes like this: Skyrim's engine, the "Creation Engine" is not really new.  In fact, many others on the Internet postulate that it's simply built on the old engine, Gamebryo.  Gamebryo had some issues with Skyrim's predecessor, Oblivion.  Some research shows that the Creation Engine and Gamebryo share similar configuration files and in fact, the same behaviors.  Anyway, there's a lot to be said about Gamebryo vs. Creation that could fill pages and pages, but in short, I believe that "Creation" is a "new" engine that was built on the old one.  Gamebryo had problems with multi-core processors and it seems the Creation Engine also has at least some version of these problems.  Namely, not effectively using multi-cores and being single threaded.  In fact, some websites have shown that Skyrim does not run that much better with a quad-core vs a dual-core.  Something like 50% more performance from single-core to dual-core but only 3% from dual-core to quad-core.

Core Parking is essentially Windows deciding that if a core is not busy, it will power down that core to save energy.  If that core is needed for something, it wakes up in milliseconds.  My disabling Speedstep effectively prevented cores from being parked because they were running at full speed all the time anyway (although no load - it's quite possible that Core Parking relies on Speedstep to lower the core ratio/frequency and/or core voltage before deciding that the core is ready for 'parking').  At this point it's probably best to describe each:

Speedstep - this is a power-saving technology that reduces the frequency ratio of a core (mine goes from 34x to 16x @ 100MHz base, so 3.4GHz to 1.6GHz).  Additionally, it can reduce the voltage draw of a core as well.  The latest iteration also has a "Turbo" mode where the core can boost up to something like 42x (4.2GHz) if needed.  This is enabled/disabled in the BIOS.

Core Parking - also reduces power use, this basically disables a core.  This is a Windows thing.
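The Speedstep numbers above are simply the multiplier times the base clock.  A minimal sketch of that arithmetic, using the 100MHz base and the 16x/34x/42x ratios quoted above (the function name is made up for illustration):

```python
BASE_CLOCK_MHZ = 100  # base clock quoted in the post


def core_frequency_ghz(multiplier, base_clock_mhz=BASE_CLOCK_MHZ):
    """Return the core frequency in GHz for a given ratio (multiplier)."""
    return multiplier * base_clock_mhz / 1000


# Ratios mentioned in the post: Speedstep floor, stock, and approximate Turbo.
for label, ratio in [("Speedstep floor", 16), ("stock", 34), ("Turbo (approx.)", 42)]:
    print(f"{label}: {ratio}x -> {core_frequency_ghz(ratio):.1f} GHz")
```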

Since I have a quad-core 2600K with multi-threading, Windows sees 8 processors.   During my test, when idle, with Speedstep on, Windows parked CPU1, CPU3, CPU5 and CPU6 and sometimes CPU7.  When I kicked up the load, the CPUs would get unparked and would kick in.

When Skyrim runs, it's not very processor intensive, at least not on my system.  The loads go up and down depending on what's happening.  Perhaps while in game, some cores are parked.  Perhaps combined with my 16GB of RAM and virtually nothing else running on my system, save for Steam, I'm not really needing a lot of cores while playing Skyrim.  My current theory is that neither Gamebryo nor the Creation Engine handles this very well with 4 cores and 8 "virtual processors."

I tested this by re-enabling all the defaults (i.e. Speedstep) and then disabling the Core Parking feature of Windows 7.  Note that there are many ways to do this.  I went with enabling the menu options and then setting minimum to 100% and maximum to 0% (see below).

Be sure to set "min cores" to 100% and "max cores" to 0%.
I recommend checking out the Microsoft way.  Anyway, it worked - I verified the Core Parking and Core Un-Parking effects using the built-in "Resource Monitor" of Windows (scroll down for what it looks like).  Skyrim was played thoroughly for about an hour and DID NOT HAVE ANY CRASHES.  It's quite possible that this may be the "core issue" (pardon the pun).
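For anyone who prefers the command line to the menu options, here is a rough sketch of the "min cores to 100%" half of the change using Windows' powercfg tool.  It is not the exact method used above: it assumes the SUB_PROCESSOR and CPMINCORES aliases are recognised by powercfg on your Windows build (check with "powercfg /aliases" first), it does not touch the "max cores" setting, and the helper names are made up.  By default it only prints the commands (dry run).

```python
import subprocess
import sys

# Assumption: these powercfg aliases exist on the target Windows build.
SUBGROUP = "SUB_PROCESSOR"
MIN_CORES = "CPMINCORES"


def core_parking_commands(min_cores_percent=100):
    """Build powercfg command lines that keep this percentage of cores unparked."""
    if not 0 <= min_cores_percent <= 100:
        raise ValueError("percentage must be between 0 and 100")
    value = str(min_cores_percent)
    return [
        # Plugged-in (AC) and battery (DC) values for the active power plan.
        ["powercfg", "/setacvalueindex", "SCHEME_CURRENT", SUBGROUP, MIN_CORES, value],
        ["powercfg", "/setdcvalueindex", "SCHEME_CURRENT", SUBGROUP, MIN_CORES, value],
        # Re-apply the active plan so the new values take effect.
        ["powercfg", "/setactive", "SCHEME_CURRENT"],
    ]


def apply(commands, dry_run=True):
    """Print each command; run it only on Windows and only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run and sys.platform == "win32":
            subprocess.run(cmd, check=True)


apply(core_parking_commands())
```

Run it from an elevated prompt with dry_run=False only after confirming the aliases exist on your machine.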


It's still early in the testing, but at least I'm being somewhat green again, although not as green as when Core Parking is enabled.  But the power savings is minimal for me.  This is mostly intended for the huge server farms where some systems may stay idle for long stretches of time.

I'll give this at least a week of testing before I call it a win, but so far so good.  (I'll be out for a few weeks though, so no updates for a while.)   It's a nice discovery in any case and I learned way more about power savings than I intended to.  But now that I found it, I can't stop looking at it - kinda like the orb in Skyrim,

I found it, now I need to understand it before I can control it.

These are what parked cores look like in Resource Monitor

After disabling core parking
Final note: After enabling the menu item, I made the changes and they immediately took effect, no reboot was necessary.  But, you probably should reboot anyway, just to make sure.  This is still Windows after all.  :-)

January 18, 2012

Skyrim Save Games

UPDATE: It's quite possible that disabling Speedstep is overkill.  I'm having success re-enabling Speedstep and disabling CORE PARKING.

I was doing some reading and noticed that one of the issues with Skyrim that is routinely brought up is the "saved game issue," which is basically the idea that corrupt saved games or oversized saved games cause problems.  In almost any Skyrim troubleshooting forum thread, you will have someone post a suggestion to disable auto-save.

According to this post:

(Format is a bit different but as with Morrowind the .ess file is basically a "mod" loading up the changes you've done to the game and your character to try and describe it fairly simple.)
This looks like a reasonably accurate documentation for the Skyrim saved game file format.

What this tells me is that the author of the post is not a native English speaker and that the game basically starts up and then applies a series of variables to the engine to continue the game where you left off.  Something like:

  1. Start at time 0, basic game start.
  2. Apply the character variables (height, weight, color, attributes)
  3. Apply the skills variables
  4. Position player
  5. Load the variables for the area completed (along with individual stats for each)
  6. Load the character inventory
  7. Fast forward time
  8. etc
  9. etc
  10. Start the game
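The steps above amount to replaying a list of changes on top of a fresh baseline.  Here is a toy sketch of that idea.  It is purely my mental model, not Bethesda's code or the real .ess layout; every field and function name is made up for illustration.

```python
import copy

# Hypothetical "time 0" state; field names are illustrative only.
BASE_STATE = {
    "skills": {},
    "inventory": [],
    "position": (0, 0, 0),
    "game_hours": 0,
}


def set_skill(name, level):
    def change(state):
        state["skills"][name] = level
    return change


def add_item(item):
    def change(state):
        state["inventory"].append(item)
    return change


def advance_time(hours):
    def change(state):
        state["game_hours"] += hours
    return change


def load_save(changes, base=BASE_STATE):
    """Replay the saved changes, in order, on a fresh copy of the baseline."""
    state = copy.deepcopy(base)
    for change in changes:
        change(state)
    return state


# A "save file" here is just the ordered list of changes since time 0.
save_file = [set_skill("archery", 35), add_item("battle axe"), advance_time(120)]
state = load_save(save_file)
print(state)
```

If the real engine works anything like this, one bad change record could sit harmlessly in the save until the game touches the affected variable, which would fit crashes that only show up after a period of play.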

This is in line with what looks like (haven't tried yet) easy cheats, like giving yourself weapons and giving yourself extra perks through the tilde (~) console commands.  This also matches up with my experience where the crashes didn't happen right away, only after a period of time (i.e. variables changing).

I wonder if something in the game engine variables is bad with my saved games, which causes periodic crashes.  One way to test this is to clear out all the saved game info (have to Google how to do that) in order to "start from scratch" and then start a new character and see when the crashes start.  Or another way is to examine the contents of the saved game and somehow turn on debugging and see if there are any exceptions that are generated by the game engine (there has to be a debugging mode) when the saved game is loaded.

I'm assuming this is normal behavior for any game loading, but maybe the way this was implemented by Bethesda is buggy in some yet-unknown-configurations.  Maybe it's a combination of issues and I'm fumbling like a thumbless monkey trying to build a nuclear reactor.  Analysis continues....

Skyrim Crashes Revisited

UPDATE:  I'm having success re-enabling Speedstep and disabling CORE PARKING.
  • 2011 DEC:
    • Installed Skyrim, Steam patched it to the 1.3 version, supposedly fixing a bunch of things
    • Skyrim works for a few days and then the dreaded "Reboot/Blinking Out" issue started.  In short, the entire system reboots.
  • 2012 JAN:
    • From late December to mid-January, troubleshooting in earnest.  All tests with other games, benchmarks, stress test, etc showed no issues. Roundup of testing here.
    • Charted voltages, temperatures, etc and found no issues whatsoever outside of Skyrim rebooting the system.
    • Bought and installed a 1000W Silverstone SST-ST1000-P PSU replacing the no-name 700W PSU that came with the system.  No change.
    • Tried a series of changes, systematically testing each one.  The one that made a change was disabling Intel Speedstep.  Skyrim crashed no more.
    • For about a week, no crashes and Skyrim is splendid!
    • The other day, I got a desktop crash.  Restarted Skyrim and no further issues.
    • Last night, I got a reboot.  Not sure why, nothing had changed system-wise or application-wise.  It's the same system that was running fine previously.  Reboots occurred twice more and then stopped.  I doubt it's related, but I was at the mead poisoning mission just outside of Whiterun, in the dungeon just before combating the crazy conjurer that lived with the Skeevers.  I restarted the saved game and got beyond that part and no other issues for the next several hours.  (????????)
I am fairly confident (92%) that this is a problem with Skyrim programming.  I'm not bitter about having to buy a new PSU, because that just means ALL the critical components of my system are now high end pieces.  Also, if I ever decide to go dual or triple SLI, then I'm all set.

Skyrim being awesome
Skyrim not being awesome


Skyrim is an awesome game for sure, but these crashes are very, very frustrating.  It takes so much away from the game.  I really, really hope this gets resolved soon, either through Bethesda identifying an issue and releasing a patch or by me figuring out WTF is going on.

January 17, 2012

Website Password Requirements

Increasingly, I have more and more passwords to remember.  Mainly, this is due to different requirements (i.e. minimum length, upper/lower case, numbers, special characters, expiration every X months, etc).  Unfortunately, because I am human, my passwords need to make some sense to me.  I already have several "series" of passwords that I keep track of mentally.  But, the problem is when faced with a login screen, I sometimes have a hard time remembering which password is for what (I'm still fighting the urge to write passwords down).  If I could see what the password requirements are, I could make a reasonably successful guess at which password is what.

I believe I'm not the only one, so I'm going to start compiling a list of websites and their password requirements for reference.  If you want me to add a site, please leave a comment with the attributes for that site and I'll add it.



Site: amazon.com
Minimum Length: 6
Numbers Required: N
Upper/Lower Required: N
Special Chars Required: N
Expires: N

Site: google.com
Minimum Length: 8
Numbers Required: N
Upper/Lower Required: N
Special Chars Required: N
Expires: N

Site: newegg.com
Minimum Length: 6
Numbers Required: N
Upper/Lower Required: N
Special Chars Required: N
Expires: N
Other: No spaces, alphanumeric only

Site: isc2.org
Minimum Length: 8 - 20 characters
Numbers Required: Y
Upper/Lower Required: N
Special Chars Required: Y
Expires: N

Site: americanexpress.com
Minimum Length: 8 - 20 characters
Numbers Required: Y
Upper/Lower Required: N
Special Chars Required: Y (only %,&, _, ?, #, =, -)
Expires: N
Other: No spaces and NOT CASE SENSITIVE (!!)
NOTE: This suggests that they may be storing the ACTUAL PASSWORD in their database and not using hashes, although normalizing the case before hashing would give the same behavior.  Ugh.

Site: Apple/iTunes
Minimum Length: 6 characters
Numbers Required: N
Upper/Lower Required: N
Special Chars Required: N
Expires: N

Site: dell.com
Minimum Length: 6 characters
Numbers Required: Y
Upper/Lower Required: N
Special Chars Required: N
Expires: N

Site: zappos.com
Minimum Length: 8 characters
Numbers Required: Y/N (either this or Special Char)
Upper/Lower Required: Y
Special Chars Required: Y/N (either this or Numbers)
Expires: N
NOTE 1: Cannot use any of the previous 6 passwords.
NOTE 2: New requirements since they were hacked.
NOTE 3: Sends email notification of password change.
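The list above can be turned into a small lookup that narrows down which of my password "series" could possibly belong to a site.  This is a minimal sketch: only a few sites are transcribed, the class and function names are made up, the sample passwords are obviously fake, and "special character" is assumed to mean anything that is not a letter, digit or space.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    min_length: int
    max_length: Optional[int] = None
    needs_digit: bool = False
    needs_mixed_case: bool = False
    needs_special: bool = False
    digit_or_special: bool = False  # Zappos-style "either numbers or special chars"


# Transcribed from the list above; only a few sites shown.
POLICIES = {
    "amazon.com": Policy(min_length=6),
    "isc2.org": Policy(min_length=8, max_length=20, needs_digit=True, needs_special=True),
    "dell.com": Policy(min_length=6, needs_digit=True),
    "zappos.com": Policy(min_length=8, needs_mixed_case=True, digit_or_special=True),
}


def satisfies(password: str, policy: Policy) -> bool:
    """Return True if the password meets every requirement in the policy."""
    has_digit = any(c.isdigit() for c in password)
    has_special = any(not c.isalnum() and not c.isspace() for c in password)
    mixed = any(c.islower() for c in password) and any(c.isupper() for c in password)
    if len(password) < policy.min_length:
        return False
    if policy.max_length is not None and len(password) > policy.max_length:
        return False
    if policy.needs_digit and not has_digit:
        return False
    if policy.needs_special and not has_special:
        return False
    if policy.needs_mixed_case and not mixed:
        return False
    if policy.digit_or_special and not (has_digit or has_special):
        return False
    return True


def candidates(passwords, site):
    """Filter a list of remembered passwords down to those a site would accept."""
    return [p for p in passwords if satisfies(p, POLICIES[site])]


my_series = ["hunter2", "Hunter2go", "hunter2go!"]  # fake examples
print(candidates(my_series, "zappos.com"))
```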

Zappos Hacked

I woke up to this email on Sunday:

 Here's the text of the email (bolding is mine):

First, the bad news:

We are writing to let you know that there may have been illegal and unauthorized access to some of your customer account information on Zappos.com, including one or more of the following: your name, e-mail address, billing and shipping addresses, phone number, the last four digits of your credit card number (the standard information you find on receipts), and/or your cryptographically scrambled password (but not your actual password).

THE BETTER NEWS:

The database that stores your critical credit card and other payment data was NOT affected or accessed.

SECURITY PRECAUTIONS:

For your protection and to prevent unauthorized access, we have expired and reset your password so you can create a new password. Please follow the instructions below to create a new password.

We also recommend that you change your password on any other web site where you use the same or a similar password. As always, please remember that Zappos.com will never ask you for personal or account information in an e-mail. Please exercise caution if you receive any emails or phone calls that ask for personal information or direct you to a web site where you are asked to provide personal information.

PLEASE CREATE A NEW PASSWORD:

We have expired and reset your password so you can create a new password. Please create a new password by visiting Zappos.com and clicking on the "Create a New Password" link in the upper right corner of the web site and follow the steps from there.

We sincerely apologize for any inconvenience this may cause. If you have any additional questions about this process, please email us at passwordchange@zappos.com.

After not panicking, I tried to figure out which password I used on the Zappos site, but alas I got this when I tried to access it:


Because clearly, blocking traffic from locations "outside the continental United States" is a security measure.  *rolleyes*  This is the Internet, folks - physical location is irrelevant.  If I'm attacking a website in the US, the exploit packets will NOT be coming from outside the US, unless I want them to.



January 13, 2012

Skyrim vs Hardware

UPDATE: It's quite possible that disabling Speedstep is overkill.  I'm having success re-enabling Speedstep and disabling CORE PARKING.
 
Tonight, I've been doing some research on Skyrim because the whole "Speedstep fixing the crashing/freezing/rebooting/blinking out" thing bugs me.  There has to be something else that fixes it that doesn't involve me changing my hardware settings.

I've read several forum posts and articles and have made note of some "fixes" that others suggest.  I've hand-picked the ones that seem promising and tried to filter out the noise.

Just as a reminder, the symptoms/facts:
  • While playing Skyrim, the entire system will either reboot or freeze with no video output and looping audio.
  • System has not experienced ANY issues outside of Skyrim:
    • All other games run fine
    • Benchmarking tools (CPU, memory and GPU centric)
    • Stress test tools (CPU, memory and GPU centric)
    • Running three stress tests each focusing on CPU, memory and GPU running AT THE SAME TIME (can I get a "hey hey" for that one?)
    • These tools and games run for hours and hours (up to seven hours)
  • Replacing the 700W power supply with a Silverstone 1000W PSU made no difference
  • Reducing the CPU, GPU and RAM speeds has made no difference
  • Running in window mode made no difference
  • Disabling Speedstep has made a difference - no crashes yet
I have some previous posts that detail troubleshooting and findings.  Instead of linking each one, here is a compilation post with links to the previous posts.

Some conclusions that I currently hold:
  1. My system is solid - hardware and software showed no issues outside of Skyrim
  2. This is not a hardware problem
  3. Bethesda is doing something different with its programming: Skyrim can boot into the main menu in less than 5 seconds and then load a save game and have you completely in the world of Skyrim where you left off in about seven seconds.  No other game can do this.
  4. Skyrim is suffering from the same issues as Fallout 3 and Oblivion since their "Creation Engine" seems to be built on the Gamebryo engine.
Some things I'd like to try when I'm feeling researchy and analytical:
  • Assign the "TESV.exe" process to a single CPU in the Task Manager
  • Assign a Higher Priority to the "TESV.exe" process in the Task Manager
  • Changing some configuration file settings 
    • Skyrim.ini -> go to [General] and add the line "iNumHWThreads=4" (or 3 or 2)
    • Skyrim.ini -> go to [HAVOK] and add the line "iNumThreads=1"
From researching engine related issues, I found multiple references to similar Fallout 3 issues.  It seems that the issues are focused on Multithreading synchronization issues and reducing the number of threads and CPUs handling the threads could help.  That variable is apparently the "iNumHWThreads".
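A rough sketch of making those two Skyrim.ini edits with a script instead of by hand.  It uses Python's standard configparser on a text sample rather than the real file; the sample line and the function name are made up, and whether the game actually honors these settings is still unverified.  One caveat: configparser drops comments when it rewrites the file, so work on a backup copy.

```python
import configparser
import io


def add_thread_limits(ini_text, hw_threads=4, havok_threads=1):
    """Return ini_text with iNumHWThreads and iNumThreads set as suggested above."""
    parser = configparser.ConfigParser(
        strict=False,         # tolerate duplicate keys
        interpolation=None,   # treat '%' literally
        allow_no_value=True,  # tolerate bare keys without '='
    )
    parser.optionxform = str  # keep key case (default would lowercase iNumHWThreads)
    parser.read_string(ini_text)
    for section in ("General", "HAVOK"):
        if not parser.has_section(section):
            parser.add_section(section)
    parser.set("General", "iNumHWThreads", str(hw_threads))
    parser.set("HAVOK", "iNumThreads", str(havok_threads))
    out = io.StringIO()
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


# Illustrative input only; a real Skyrim.ini has many more entries.
sample = "[General]\nsLanguage=ENGLISH\n"
print(add_thread_limits(sample))
```

To use it on the real file, read Skyrim.ini into a string, pass it through add_thread_limits, and write the result back after making a copy of the original.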

In the meantime, here's a panorama from the Alftand Elevator:



January 12, 2012

Skyrim Shots

Click on the photos for a larger view.

Taking my time crossing a river.  Yes, that's a Battle Axe.  Yes, I'm OK with that.

My what minty breath you have!
The action option is "Talk to Dog"

Look!  I got an arrow in the knee!

Skyrim Troubleshooting Compilation

UPDATE: It's quite possible that disabling Speedstep is overkill.  I'm having success re-enabling Speedstep and disabling CORE PARKING.

For the past week, I've been troubleshooting a peculiar problem with my new system.  So far I have five posts.  I'm going to use this as a summary post with links to the different parts.

BUT FIRST, a quick summary: new system runs everything great - no issues with anything EXCEPT for Skyrim, which causes a system freeze/reboot sometimes within a minute, sometimes after an hour or two.  After some troubleshooting, the problem is hopefully fixed.


The current fix involves turning off Intel Speedstep by disabling three options in the BIOS:

CPU Enhanced Halt (C1E)
C3/C6 State Support
CPU EIST Function


I posted some charts and data from running with Intel Speedstep off in part 5.

Posts related to troubleshooting Skyrim Blinking Out/Crashing/Rebooting:
For the initial post describing the system and initial troubleshooting steps (no resolutions but with fancy graphs), click on Part 1.

For the second post with more troubleshooting steps at the suggestion of a friendly Nexus Forums user and with crazy load testing (but no resolutions), click on Part 2.

The third post describes the replacement of the Power Supply Unit (no-name 700W PSU) that came with the system with a 1000W Silverstone SST-ST1000-P, which resulted in no resolutions BUT with some "gut check" tests that produced some positive results.  To read this, click on Part 3.

The fourth post is what the system looks like with Intel Speedstep disabled.  Please universe, let it be this one; I just want to play Skyrim!  This can be found in Part 4.

The fifth (and hopefully last) post on this topic has some charts and data from running with Intel Speedstep off.  Go to Part 5 to read this.

I promise at least one photo per post for your entertainment!

Here's one of my system opened up, showing the water cooling unit on the CPU (bottom right) with the old PSU:




Skyrim Blinking Out/Rebooting Part 5 [SOLVED]

UPDATE: It's quite possible that disabling Speedstep is overkill.  I'm having success re-enabling Speedstep and disabling CORE PARKING.

After putting in some hours last night, I'm fairly confident that the problem with Skyrim rebooting my system is solved by disabling Intel Speedstep.  No issues whatsoever running Skyrim.  Naturally, since the idea behind Speedstep is to lower the power consumption, it stands to reason that increased power consumption and increased heat would be a concern.  I can't do anything about the power consumption, but I should really worry about the heat.  To that end, here are some charts I put together that hopefully will help add to the discussion regarding Skyrim reboots and the effects of disabling Speedstep. 

To disable Speedstep, go into your BIOS and disable the following "Advanced CPU" Core Features:

CPU Enhanced Halt (C1E)
C3/C6 State Support
CPU EIST Function


The following charts demonstrate the effect on CPU frequency and voltage when Speedstep is enabled and disabled:
CPU Frequency and Voltage under load with Speedstep ENABLED

CPU Frequency and Voltage under load with Speedstep DISABLED
 The following charts illustrate the effect on temperature on each core with Speedstep enabled and disabled.  As expected, the GPU temperature was not affected:

CPU/GPU Temperatures under load with Speedstep ENABLED

 
CPU/GPU Temperatures under load with Speedstep DISABLED

The data show that with Speedstep enabled, we get slightly lower CPU temperatures, with the cores never going beyond 60C, and possibly slightly lower CPU loads (this probably needs a more controlled test to confirm).  After Speedstep is disabled, we see core temperatures getting closer to 70C, still below the danger zones, although getting really close to it.  It's still not clear what the temperature max is for the 2600K, but Intel's specs show a "TCase" of 72.6C, which probably doesn't mean as much as the TJMax, which, if I read the Core MSRs on my CPU properly, comes out to 01100010 or 0x62 or 98C:

MSR 0x000001A2        0x00000000    0x00621200

I'm looking at this as the temperature at which the CPU will call it quits.  With a 20% buffer, I think I am comfortably outside the overheating temperature zone for this CPU.
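For reference, this is how the 98C figure falls out of the MSR dump above.  The sketch assumes, as I read it, that bits 23:16 of the low 32-bit word of MSR 0x1A2 hold the TJMax value; the 70C peak is the rough figure from the charts, and the function name is made up.

```python
MSR_TEMPERATURE_TARGET = 0x1A2  # register number from the dump above
raw_value = 0x00621200          # low 32 bits from the dump above


def tjmax_celsius(msr_value):
    """Extract bits 23:16, read here as TJMax in degrees C."""
    return (msr_value >> 16) & 0xFF


tjmax = tjmax_celsius(raw_value)
observed_peak_c = 70  # approximate peak core temperature with Speedstep disabled
headroom = (tjmax - observed_peak_c) / tjmax

print(f"MSR 0x{MSR_TEMPERATURE_TARGET:X}: TJMax = {tjmax}C (0x{tjmax:02X}, 0b{tjmax:08b})")
print(f"Headroom at {observed_peak_c}C: {headroom:.0%}")
```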

Presumably the power consumption goes up with Speedstep disabled, but I have no measuring tools for that.  HWInfo64 does report the +12V rail at a steady 12.025V - 12.074V.  Note: with my old no-name PSU, it varied between 11.436V and 11.976V.
So far so good.  I hope this helps someone out there.


January 11, 2012

Skyrim Blinking Out/Rebooting Part 4


 After the initial possible success of disabling Speedstep, I decided to re-enable Speedstep and sure enough, the reboots were seen again.  I also tried to adjust the Windows 7 power settings to 100%/Performance, but to no avail.  The informal causal relationship appears to be if the CPU frequency is changing, Skyrim does not like it.  Even under heavy load, with Speedstep enabled, I see frequency jumps from 1600MHz to 3700MHz varying for each core and voltage jumps from 0.966V to 1.281V.  It appears Skyrim just does not like this behavior.

Chart showing the CPU Frequency and Voltage fluctuations under load with Speedstep enabled.
With Speedstep disabled, we now have a steady 3700MHz for each core and voltage at a steady 1.276V (varying slightly, but only by a few fractions).

Chart showing the CPU Frequency and Voltage with Speedstep disabled.

HWInfo64 showing a steady clock on all four cores even in idle

CPU-Z CPU information
A few more days of no crash/reboots in Skyrim should be enough to show success.

If you want to try this, these are the items in the BIOS that I set to [Disabled] to turn off Speedstep:

CPU Enhanced Halt (C1E)
C3/C6 State Support
CPU EIST Function

 
Below is a screenshot of the BIOS screen where the settings are disabled marked in clumsy red arrows.  Note that this was stolen from a website so the clock ratio and frequency are not mine (WOW!):

Change the marked items to disabled to turn off Speedstep

Note: I also increased the memory clock from 666.7MHz to 800MHz, which gives an effective DDR3 rate of 1600MHz.

January 10, 2012

Skyrim Blinking Out/Rebooting Part 3

I bought and installed a Silverstone SST-ST1000-P PSU in hopes that the un-named 700W PSU that came with my system was causing the problems with Skyrim.  Below is a picture of the Silverstone (top) and the unnamed PSU (bottom).  The SST-ST1000-P is a 1000W PSU with 80A on the +12V (single) rail, which should be more than enough for even three GTX 570s.  It can also peak at 1100W.

Silverstone SST-ST1000P 1000W PSU on top, unnamed 700W PSU below

It's also modular, which is really nice.

However, after carefully completing the PSU swap (took a little over an hour), I fired up Skyrim and within a minute saw the same reboot problem.  After cursing under my breath, I started to plot the next steps.  First, I am fine with the new PSU.  It replaces a PSU that is questionable at best, eliminating that as a potential problem.  I then wrote down some other "gut check" things to try (that wouldn't cost money) and started to go through them:

  • Update GPU BIOS - I was surprised to find that there is a more recent BIOS for the GPU, adding a ".01" - seems to be a minor update, but I tried it anyway.  No luck, same symptoms.
  • Re-install DirectX - I was not able to find a way to do this, since it's a core component of Windows 7.  Did DirectX validation and checks.  No change.
  • Underclock/Overclock the GPU.  No change either way, other than slower graphics when the settings are lowered.
  • Underclock/Overclock the CPU.  No change either way.
  • Examine the memory speed.  I have four 4GB Corsair DDR3 memory modules rated for an 800MHz clock (1600MHz effective).  Out of the box, though, the system booted with the memory at 1333MHz.  I adjusted this to 1600MHz, and while it looked promising, the same problem (reboot) occurred.
  • Disable Intel SpeedStep and variable VCORE.  This took some research and I learned a lot about how these new features from Intel work.  I had to disable three BIOS entries in the GIGABYTE Z68MA-D2H-B3 INTEL Z68 motherboard:

Go to Advanced Frequency Settings->Advanced CPU Core Features and disable the following:
CPU Enhanced Halt (C1E)
C3/C6 State Support
CPU EIST Function


After obtaining a fixed frequency and VCORE for the i7 2600K, I played Skyrim for over an hour without a reboot!  I shut it down since I was at the end of my day.  This seemed to have made a difference, although I am at best cautiously optimistic about this being the fix.

After going back and looking at the stats from previous tests, the CPU VCORE and CPU Frequencies were all over the place.  This is by design of course, but now I'm thinking that Skyrim programming may be hitting hardware more directly than other games.  It may be my tired state, but consider the following (note this may just be crazy thoughts):

  • Skyrim is a port from consoles.  Console games are written for specific hardware, giving programmers more access to low-level functions than, say, a DirectX game.
  • Skyrim load times are ridiculously short.  From hitting "Play" it takes about 5 seconds for the main screen to come up, then maybe another 5-10 seconds to load a saved game.  Compare this to other games and you know they are doing something different with Skyrim.  Why would other games take so much longer to load than Skyrim when Skyrim is clearly a larger game?  Perhaps they skipped a lot of the frameworks that other games are built on.  Even something like the benchmark tool Heaven DX11 takes a while just to load all the textures.
  • If the problem is the variable CPU frequency/VCORE, why would I not be able to replicate the Skyrim reboots with other games?

In any case, cautiously optimistic is how I describe this phase of troubleshooting.  Next, I'm going to re-enable the BIOS settings and try to limit the CPU Speedstep via Windows 7 Power Settings instead.
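
For reference, the Windows-side setting in question is the "Minimum processor state" entry under Processor power management in the advanced power options.  Below is a minimal Python sketch that only prints the equivalent powercfg commands rather than running them.  The function name is made up for illustration, and the aliases (SCHEME_CURRENT, SUB_PROCESSOR, PROCTHROTTLEMIN) are assumed to match what `powercfg -aliases` lists on your system, so check there first.  Whether this has the same effect as the BIOS switches is exactly what still needs to be tested.

```python
from typing import List


def min_processor_state_commands(percent: int = 100) -> List[str]:
    """Build (but do not run) powercfg commands that set the minimum
    processor state of the active power plan while on AC power.

    Assumption: the aliases used here exist on the target system;
    verify with `powercfg -aliases` before running anything.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return [
        # Set the minimum processor state for the current plan (AC power).
        "powercfg -setacvalueindex SCHEME_CURRENT SUB_PROCESSOR "
        f"PROCTHROTTLEMIN {percent}",
        # Re-apply the current plan so the new value takes effect.
        "powercfg -setactive SCHEME_CURRENT",
    ]


for command in min_processor_state_commands(100):
    print(command)
```

The printed lines would be pasted into an elevated command prompt.  Setting the value back to a low percentage restores the normal power-saving behavior.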

Just in case this is a red herring, I'm planning on hitting the Sound Blaster X-FI card next by completely removing it.  After that, I'm completely out of ideas.

If you haven't already, be sure to see part 1 and part 2 of the Skyrim troubleshooting saga.

January 09, 2012

Skyrim Blinking Out/Rebooting Part 2

Thanks to the suggestions of a very helpful user on The Nexus Forums (who successfully fixed his Skyrim problems by disabling his SLI config), I ran more tests and found the following:

  • Cleaned out drivers per these instructions.  Still rebooted. 
  • Running Prime95 for over 7 hours yielded no errors with RAM.
  • Running LinX for about an hour yielded no errors with RAM.
  • Running OCCT yielded no errors with GPU RAM.
  • Running OCCT "CPU:OCCT" test with large data sets for an hour showed no errors or issues.
  • Running Kombustor yielded no errors or crashes.
  • Running all of the above AT THE SAME TIME yielded no errors or crashes.
What this tells me is my system is stable - drivers and hardware all work well together.

I found some data from 03 JAN 2012 when I started earnestly troubleshooting the Skyrim crashes.  This chart shows what's happening with the GPU from idle (22:09:34), starting Skyrim (22:09:58) and then the reboot (22:17:01, end of the chart).  This involved loading the latest saved game and running through the area outside the College of Winterhold.  This session lasted 7 minutes.


As you can see, the GPU temperature and fan speeds are fairly consistent and way below the danger zone.  I did this a few more times and got similar results.  Basically, it was a good run with no indicators of any problems related to temperature or load.

So, it's still crashing.  Our friendly Nexus Forums user summed up my feelings on this:
Unlike others I do feel that the fix has to do with ones system configuration/settings and not the game itself.  I really strongly feel it has to do with video drivers and something a bit deeper.  Quite frankly this problem isn't as wide spread as it potentially could have been with the vast amount of people who've purchased & pirated the game on PC.  At one point Steam had 500,000 people playing the game at the same time so my feeling is if it was the game a whole lot more people would be complaining.
Today, I found at a local store (Conrad's) a Silverstone ST1000-P 1000W for cheaper than NewEgg(!) and will install it tonight.  Hopefully the problem is that Skyrim does *something* that causes a huge draw on the PSU and this fixes the problem.  If not, in addition to the other quality components I have that have been thoroughly tested, I will also have a quality PSU.  Then, I'll just wait for Skyrim to be patched.

More to come...

January 06, 2012

Skyrim Blinking Out/Rebooting

I ordered a new gaming system from CyberPowerPC right after Thanksgiving, taking advantage of their sale at the time.  It was a great deal and the folks at CyberPowerPC were great with the order and the subsequent follow-ups, informing me of a backordered CPU and offering good alternatives.  The system I ended up with came in before Christmas in great shape and it worked out of the box (the shipping, packing, etc was great, no issues whatsoever).

I should note upfront that I've contacted them and they've graciously offered to upgrade my power supply.

Here are the specs:

ITEM            DESCRIPTION                                           QTY 
=============== ===================================================== === 
KB-152-101      BLACK XTREME GEAR MULTIMEDIA/ INTERNET USB KEYBOARD     1 
MO-115-101      BLACK XTREME GEAR OPTICAL USB GAMING MOUSE              1 
NC-USB-117      EDIMAX EW-7811Un IEEE NANO 802.11n USB WIRELESS ADAPTER 1 
HD-403-308      2TB SATA III 6.0 GB/S 3.5" HDD                          1 
RM-317-802      4GB CORSAIR VENGEANCE 1600MHZ DDR3                      4 
FA-WATER-101    ASETEK 510LC 120MM WATERCOOLER                          1 
FA-104-116      CASE FAN 120 MM                                         1 
CS-157-519      BLACK THERMALTAKE COMMANDER MID TOWER NO POWER          1 
CD-128-101      BLACK LITE-ON 24X DVDRW                                 1 
MR-104-101      12-IN-ONE INTERNAL CARD READER                          1 
MB-374-101      GIGABYTE Z68MA-D2H-B3 INTEL Z68 CROSSFIRE DDR3 SATA3    1 
                USB3.0 MICRO ATX LGA 1155
PS-119-108      APEVIA 700WATT POWER SUPPLY                             1 
SC-102-144      CL SB X-FI XTREME AUDIO RETAIL                          1 
SW-170-113      WINDOWS 7 HOME PREMIUM SP1 64-BIT                       1 
BOX1            SYSTEM BOX AND FOAM                                     1 
SERVICE-201     SOUND ABSORBING FOAM ON SIDE, TOP AND BOTTOM PANELS     1 
SERVICE-202     POWER SUPPLY GASKET                                     1 
SERVICE-104     ANTI-VIBRATION FAN MOUNTS                               1 
CU-208-206      INTEL I7-2600K 3.40 GHZ 8M LGA 1155 RETAIL              1 
VC-207-108      GIGABYTE GEFORCE GTX 570 1.2GB DDR5 PCI-E               1 


It ran every recent game I had (HL2, Portal 2, GRID, Just Cause 2, Batman:Arkham City, Assassin's Creed 1/2/Brotherhood, etc) at the highest settings without complaint. But I really got this rig for Skyrim.
I wanted to finally see the Aurora Borealis.
Of course, as Murphy predicted, the problems started when I installed "Elder Scrolls V: Skyrim".  At first, it was great, and the adventures started.  But after a few hours, everything turned off and the system restarted.  It could have been anything at this point, and I didn't think much of it until it started happening over and over again (Skyrim is a looong game).  I was unable to replicate the problem even after hours and hours of Batman:AC and other games.

Let the troubleshooting begin!

Symptoms:
  • Skyrim causing system to reboot.
  • No other issues observed on system

Related Info:
  • Skyrim is a DX9 game and is designed to run on anything from XP to 64-bit Windows 7.
  • Batman:AC and other games utilize DX11 and are significantly more CPU and GPU intensive (IMO).
  • No issues with ANY other game.
  • OS is 64-bit Windows 7.
  • When system reboots, temperature and fans are not high.
  • Logging with HWINFO64 and other tools confirmed this.
  • Power supply that came with system is possibly APEVIA brand (boo!), but I haven't removed it from the case to confirm this.
  • Power Supply specs (from side panel pic):
    •   700W (Sticker says ATX-CV700W - can't find info on this PN)
    •   +3.3V = 38A
    •   +5V = 40A
    •   +12V1 = 23A
    •   +12V2 = 26A
  • System should need under 600W and the +12V rails seem to be sufficient (based on Internet research)
  • Running two 21" LCD monitors: One monitor at 1920x1080 and the other on 1680x1050
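
The back-of-the-envelope check behind that "seem to be sufficient" can be written out as below.  This is a rough sketch: it simply adds the two +12V rail ratings from the sticker, which gives an upper bound, since multi-rail supplies often have a combined +12V limit lower than the sum of the individual rails (no combined limit is listed above).  The names are illustrative only.

```python
# +12V rail ratings in amps, as listed from the PSU sticker above.
RAILS_12V_AMPS = {"+12V1": 23, "+12V2": 26}


def rail_watts(volts: float, amps: float) -> float:
    """Power available on a rail: P = V * I."""
    return volts * amps


combined_amps = sum(RAILS_12V_AMPS.values())
combined_watts = rail_watts(12, combined_amps)

print(f"Combined +12V: {combined_amps}A, about {combined_watts:.0f}W at best")
```

That works out to 49A, or roughly 588W, on the +12V side before any derating.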

Troubleshooting Steps so far that had no effect on the symptoms:
  • Lowered the clock rates on the video card (GIGABYTE 570 is OC a little)
  • Lowered settings and resolution in SKYRIM
  • Installed latest BETA Nvidia drivers (290.53)
  • Re-installed latest STABLE Nvidia drivers (285.62)
  • Cleaned out old video drivers
  • Disabled the (unused) onboard video (Intel HD)
  • Updated Creative Labs drivers (SB X-FI XTREME)
  • Disabled the (unused) onboard audio
  • Uninstalled "extra" stuff that Nvidia loads (i.e. 3d vision drivers)
  • Verified no OC on the CPU
  • Removed 2nd monitor
  • Ran Skyrim in windowed mode
  • Reseated all components (memory, video card)
  • Verified power and cable connections
  • Reinstalled Skyrim

Last night I loaded the Heaven DX11 Benchmark to do some load testing.  It ran for over six hours without any issues!

I also logged the system stats using HWINFO64 and the system was under heavy load (much heavier than Skyrim and Batman).  Here's a summary of the results:

Test period: 01:30:45 - 06:42:35

Starting GPU Temp: 53C
Max GPU Temp: 72C
Ending GPU Temp: 65C

Starting GPU Fan: 2400RPM (48%)
Max GPU Fan: 3510RPM (70%)
Ending GPU Fan: 3330RPM (66%)

Starting CPU Temp: 41C
Max CPU Temp: 57C
Ending CPU Temp: 47C

Starting CPU Load: 0.8%
Max CPU Load: 31.7%
Ending CPU Load: 0.9%

The chassis temperature range for the test was: 31C - 34C.

GPU memory allocation went from 130MB to 836MB and then back to 137MB, rising and falling roughly in line with the load.
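
For anyone who wants to produce the same kind of start/max/end summary from their own logs, here is a minimal Python sketch.  The sample list is made up for illustration (only its first, highest and last values mirror the GPU temperatures listed above), and the function name is hypothetical.  A real HWiNFO64 log would be read from its CSV file, whose column names are not assumed here.

```python
def summarize(samples):
    """Return (start, maximum, end) for a non-empty list of readings."""
    if not samples:
        raise ValueError("no samples to summarize")
    return samples[0], max(samples), samples[-1]


# Illustrative GPU temperature readings in degrees C, not real log data.
gpu_temp_c = [53, 61, 70, 72, 71, 65]

start, peak, end = summarize(gpu_temp_c)
print(f"Starting GPU Temp: {start}C")
print(f"Max GPU Temp: {peak}C")
print(f"Ending GPU Temp: {end}C")
```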

Here are some charts from the data gathered:
GPU Temperature stats for the entire 6-hour run.  Note that it stabilized at 71-73C, well below the "danger zone"

GPU Fan Speed stats for the entire 6-hour run.  Note that it stabilized at around 69%, well below the "danger zone"

CPU temperature, CPU utilization and Chassis Temperature stats for the entire 6-hour run.  These are well below the "danger zones"
Since Skyrim is a DX9 game, I also ran a benchmark (15 minutes) using the DX9 settings and found the same stats as above.

It's interesting to note that when actually playing Skyrim, there is a CPU load on only one core (out of four).

I had originally been leaning towards the power supply being the issue, but the test results (and lack of issues with other games) cause me to lean more towards the Skyrim code doing *something* that causes my system to have issues.

Skyrim was released in November and it already has three (four?) patches.  Unsubstantiated reports from other gamers indicate that some Internet developers have made patches that bypass and optimize some poorly-written routines in the game, improving performance.  This leads me to believe that something in Skyrim is causing the "blinking out/rebooting" issue that I (and a lot of other people) am experiencing.

UPDATE 07 JAN 2012: Tested with Kombustor DX9 and DX10, 8X MSAA, 1920x1080, POST-FX, FULLSCREEN for about an hour with no issues or crashes.


UPDATE 07 JAN 2012: I was able to play Skyrim without crashing by starting a new game.  I think my save file is corrupted somehow!  Will test this out thoroughly and will post findings later. This was a false lead - it still crashed, although it took a while longer.


See Part 2 Here.