Member

"NANs detected on GPU"

Does anyone know what this means and how to fix it?
At the moment my GPU0 client can only fold one unit before it needs a restart. I have tried both a G80-based card and a G92-based card and get the same result. I have also tried both the 182 and the 178.22 WHQL drivers, without any improvement. If anything, 182 was more stable, since with it I could sometimes run 3 or 4 units in a row without an EUE, but otherwise they were about the same.

Log (from right after a restart) below:

--- Opening Log file [February 5 19:16:58 UTC]
# Windows GPU Console Edition #################################################
###############################################################################
Folding@Home Client Version 6.23
http://folding.stanford.edu
###############################################################################
###############################################################################
Launch directory: D:\Folding\GPU0
Executable: D:\Folding\GPU0\Folding@home-Win32-GPU.exe
Arguments: -gpu 0 -verbosity 9
[19:16:58] - Ask before connecting: No
[19:16:58] - User name: Rainbowsixteen (Team 37451)
[19:16:58] - User ID: A56BAD355ACB8E5
[19:16:58] - Machine ID: 2
[19:16:58]
[19:16:58] Work directory not found. Creating...
[19:16:58] Could not open work queue, generating new queue...
[19:16:58] - Preparing to get new work unit...
[19:16:58] + Attempting to get work packet
[19:16:58] - Will indicate memory of 8189 MB
[19:16:58] - Detect CPU. Vendor: GenuineIntel, Family: 6, Model: 15, Stepping: 6
[19:16:58] - Connecting to assignment server
[19:16:58] Connecting to http://assign-GPU.stanford.edu:8080/
[19:16:58] - Autosending finished units... [February 5 19:16:58 UTC]
[19:16:58] Trying to send all finished work units
[19:16:58] + No unsent completed units remaining.
[19:16:58] - Autosend completed
[19:16:58] Posted data.
[19:16:58] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[19:16:58] + News From Folding@Home: GPU folding beta
[19:16:59] Loaded queue successfully.
[19:16:59] Connecting to http://171.67.108.11:8080/
[19:16:59] Posted data.
[19:16:59] Initial: 0000; - Receiving payload (expected size: 47170)
[19:17:00] - Downloaded at ~46 kB/s
[19:17:00] - Averaged speed for that direction ~46 kB/s
[19:17:00] + Received work.
[19:17:00] + Closed connections
[19:17:00]
[19:17:00] + Processing work unit
[19:17:00] Core required: FahCore_11.exe
[19:17:00] Core found.
[19:17:00] Working on queue slot 01 [February 5 19:17:00 UTC]
[19:17:00] + Working ...
[19:17:00] - Calling '.\FahCore_11.exe -dir work/ -suffix 01 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 2428 -version 623'
[19:17:01]
[19:17:01] *------------------------------*
[19:17:01] Folding@Home GPU Core - Beta
[19:17:01] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:17:01]
[19:17:01] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[19:17:01] Build host: amoeba
[19:17:01] Board Type: Nvidia
[19:17:01] Core :
[19:17:01] Preparing to commence simulation
[19:17:01] - Looking at optimizations...
[19:17:01] - Created dyn
[19:17:01] - Files status OK
[19:17:01] - Expanded 46658 -> 252912 (decompressed 542.0 percent)
[19:17:01] Called DecompressByteArray: compressed_data_size=46658 data_size=252912, decompressed_data_size=252912 diff=0
[19:17:01] - Digital signature verified
[19:17:01]
[19:17:01] Project: 5765 (Run 11, Clone 77, Gen 34)
[19:17:01]
[19:17:01] Assembly optimizations on if available.
[19:17:01] Entering M.D.
[19:17:07] Working on Protein
[19:17:08] Client config found, loading data.
[19:17:08] Starting GUI Server
[19:17:08] mdrun_gpu returned
[19:17:08] NANs detected on GPU
[19:17:08]
[19:17:08] Folding@home Core Shutdown: UNSTABLE_MACHINE
[19:17:11] CoreStatus = 7A (122)
[19:17:11] Sending work to server
[19:17:11] Project: 5765 (Run 11, Clone 77, Gen 34)
[19:17:11] - Read packet limit of 540015616... Set to 524286976.
[19:17:11] - Error: Could not get length of results file work/wuresults_01.dat
[19:17:11] - Error: Could not read unit 01 file. Removing from queue.
[19:17:11] Trying to send all finished work units
[19:17:11] + No unsent completed units remaining.
[19:17:11] - Preparing to get new work unit...
[19:17:11] + Attempting to get work packet
[19:17:11] - Will indicate memory of 8189 MB
[19:17:11] - Connecting to assignment server
[19:17:11] Connecting to http://assign-GPU.stanford.edu:8080/
[19:17:12] Posted data.
[19:17:12] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[19:17:12] + News From Folding@Home: GPU folding beta
[19:17:12] Loaded queue successfully.
[19:17:12] Connecting to http://171.67.108.11:8080/
[19:17:12] Posted data.
[19:17:12] Initial: 0000; - Error: Bad packet type from server, expected work assignment
[19:17:12] - Attempt #1 to get work failed, and no other work to do. Waiting before retry.
[19:17:31] + Attempting to get work packet
[19:17:31] - Will indicate memory of 8189 MB
[19:17:31] - Connecting to assignment server
[19:17:31] Connecting to http://assign-GPU.stanford.edu:8080/
[19:17:32] Posted data.
[19:17:32] Initial: 43AB; - Successful: assigned to (171.67.108.11).
[19:17:32] + News From Folding@Home: GPU folding beta
[19:17:32] Loaded queue successfully.
[19:17:32] Connecting to http://171.67.108.11:8080/
[19:17:33] Posted data.
[19:17:33] Initial: 0000; - Receiving payload (expected size: 97076)
[19:17:34] - Downloaded at ~94 kB/s
[19:17:34] - Averaged speed for that direction ~70 kB/s
[19:17:34] + Received work.
[19:17:34] Trying to send all finished work units
[19:17:34] + No unsent completed units remaining.
[19:17:34] + Closed connections
[19:17:39]
[19:17:39] + Processing work unit
[19:17:39] Core required: FahCore_11.exe
[19:17:39] Core found.
[19:17:39] Working on queue slot 02 [February 5 19:17:39 UTC]
[19:17:39] + Working ...
[19:17:39] - Calling '.\FahCore_11.exe -dir work/ -suffix 02 -priority 96 -nocpulock -checkpoint 3 -verbose -lifeline 2428 -version 623'
[19:17:39]
[19:17:39] *------------------------------*
[19:17:39] Folding@Home GPU Core - Beta
[19:17:39] Version 1.19 (Mon Nov 3 09:34:13 PST 2008)
[19:17:39]
[19:17:39] Compiler : Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.762 for 80x86
[19:17:39] Build host: amoeba
[19:17:39] Board Type: Nvidia
[19:17:39] Core :
[19:17:39] Preparing to commence simulation
[19:17:39] - Looking at optimizations...
[19:17:39] - Created dyn
[19:17:39] - Files status OK
[19:17:39] - Expanded 96564 -> 489240 (decompressed 506.6 percent)
[19:17:39] Called DecompressByteArray: compressed_data_size=96564 data_size=489240, decompressed_data_size=489240 diff=0
[19:17:39] - Digital signature verified
[19:17:39]
[19:17:39] Project: 5753 (Run 1, Clone 70, Gen 59)
[19:17:39]
[19:17:39] Assembly optimizations on if available.
[19:17:39] Entering M.D.
[19:17:46] Working on Protein
[19:17:50] Client config found, loading data.
[19:17:50] Starting GUI Server
[19:19:52] Completed 1%
[19:21:54] Completed 2%
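
For anyone who wants to see which projects the core gives up on, a log like the one above can be tallied with a few lines of Python. This is only a rough sketch: the function name `shutdowns_by_project` is made up for illustration, and the two patterns are taken from the "Project: ..." and "Folding@home Core Shutdown: ..." lines in this log, so other client or core versions may word them differently.

```python
import re
from collections import Counter

# Patterns copied from the log excerpt in this thread; other client
# versions may format these lines differently (assumption).
EVENT = re.compile(
    r"Project: (?P<project>\d+) \(Run \d+, Clone \d+, Gen \d+\)"
    r"|Folding@home Core Shutdown: (?P<status>\w+)"
)

def shutdowns_by_project(log_text):
    """Count core shutdown statuses per project number (hypothetical helper)."""
    counts = Counter()
    current = None  # most recently announced project, if any
    for match in EVENT.finditer(log_text):
        if match.group("project"):
            current = match.group("project")
        else:
            counts[(current, match.group("status"))] += 1
    return counts

# Shortened sample built from entries in the log above.
sample = (
    "[19:17:01] Project: 5765 (Run 11, Clone 77, Gen 34) "
    "[19:17:08] NANs detected on GPU "
    "[19:17:08] Folding@home Core Shutdown: UNSTABLE_MACHINE "
    "[19:17:11] Project: 5765 (Run 11, Clone 77, Gen 34) "
    "[19:17:39] Project: 5753 (Run 1, Clone 70, Gen 59) "
    "[19:19:52] Completed 1%"
)
result = shutdowns_by_project(sample)
print(result)
```

The regex scans the whole text instead of going line by line, so it does not matter whether the log is pasted with or without its line breaks. To use it on a real log, read the client's log file into a string and pass that in; if the same project number keeps turning up with UNSTABLE_MACHINE on one card but not another, that points at the work units rather than the hardware.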

Member

Re: "NANs detected on GPU"

Quote:

Originally posted by AtreX
Does anyone know what this means and how to fix it?

The only fix I have seen (which only applies to Vista, though) is to run the exe file in XP compatibility mode.

Member

You haven't overclocked the graphics card a bit too far, have you?

Member

Dust? Over time the card will eventually run too hot if you don't keep it clean.


Member

I have the same problem: at least 2-3 times a day my 2 cards crash with NANs, unstable machine, and so on.

See the thread http://sweclockers.com/forum/showthread.php?s=&threadid=82982...

But by and large, Stanford is giving us junk units that crash on certain cards.
I have an 8800GT card that runs without the slightest problem, while the 260 GTX crashes constantly.


Member

I have had similar problems for quite a while, but mostly on my Vista machines; XP has been fairly stable. So I have now reinstalled XP on some of the computers that previously ran Vista, and it seems to be chugging along better now.
I can also overclock the GPUs higher on Win XP without EUE problems.
