Subject: so there you have it, the computer..
Author: Krivis Lepuonis
Date: 2011-05-31 12:21:13
CERN processes extremely large volumes of information (several terabytes per 
second),
and persistent partial data loss (data corruption) was noticed in that 
stream..
special tests were then run, here are the results:

The program
The analysis looked at data corruption at 3 levels:

Disk errors.
They wrote a special 2 GB file to more than 3,000 nodes every 2 hours
and read it back, checking for errors, for 5 weeks. They found 500 errors on 
100 nodes.
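
A write-then-read-back probe of this kind could be sketched roughly as below. This is only an illustration, not CERN's actual test program: the function names, the SHA-256 comparison and the pattern generator are all assumptions made for the example.

```python
import hashlib
import os


def write_pattern_file(path, size_bytes, chunk_size=1 << 20, seed=b"probe"):
    """Write a deterministic pattern to `path` and return its SHA-256 hex digest.

    Hypothetical helper: the pattern is derived from a seed and a counter,
    so it is reproducible without keeping a second copy of the data.
    """
    digest = hashlib.sha256()
    written = 0
    counter = 0
    with open(path, "wb") as f:
        while written < size_bytes:
            block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
            want = min(chunk_size, size_bytes - written)
            # Repeat the 32-byte block until it covers the chunk, then trim.
            chunk = (block * (want // len(block) + 1))[:want]
            f.write(chunk)
            digest.update(chunk)
            written += len(chunk)
            counter += 1
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device
    return digest.hexdigest()


def read_digest(path, chunk_size=1 << 20):
    """Read the file back and return the SHA-256 hex digest of what was read."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def probe_once(path, size_bytes):
    """Return True if the data read back matches what was written."""
    expected = write_pattern_file(path, size_bytes)
    # Caveat: an immediate read may be served from the page cache rather
    # than the disk, so a real probe would read back later or bypass the cache.
    return read_digest(path) == expected
```

Running `probe_once` on a schedule and logging every `False` result would give an error count per node, similar in spirit to the numbers quoted above.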

Single bit errors.
10% of disk errors.

Sector (512 bytes) sized errors.
10% of disk errors.

64 KB regions.
80% of disk errors. This one turned out to be a bug in WD disk firmware 
interacting with 3Ware controller cards which CERN fixed by updating the 
firmware in 3,000 drives.
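
To make the three error sizes concrete, here is a rough sketch of how a mismatch between the expected and the actual data could be sorted into those buckets. The thresholds come from the text (one bit, a 512-byte sector, a 64 KB region); the function name and the "span of differing bytes" rule are assumptions for illustration, not the method used in the study.

```python
SECTOR = 512          # bytes, as in the text
REGION = 64 * 1024    # bytes, as in the text


def classify_corruption(expected: bytes, actual: bytes) -> str:
    """Classify a mismatch by how many bits flipped and how wide the damage is."""
    if len(expected) != len(actual):
        raise ValueError("buffers must have equal length")
    diff_positions = [i for i, (a, b) in enumerate(zip(expected, actual)) if a != b]
    if not diff_positions:
        return "clean"
    # Count the individual bits that differ across all damaged bytes.
    flipped_bits = sum(bin(expected[i] ^ actual[i]).count("1") for i in diff_positions)
    if flipped_bits == 1:
        return "single-bit"
    # Distance from the first to the last damaged byte, inclusive.
    span = diff_positions[-1] - diff_positions[0] + 1
    if span <= SECTOR:
        return "sector"
    if span <= REGION:
        return "64KB-region"
    return "larger"
```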

RAID errors.
They ran the verify command on 492 RAID systems each week for 4 weeks. The 
disks are spec’d at a Bit Error Rate of 1 in 10^14 bits read/written. The good 
news is that the observed BER was only about a third of the spec’d rate. The 
bad news is that in reading/writing 2.4 petabytes of data there were some 300 
errors.

Memory errors.
Good news: only 3 double-bit errors in 3 months on 1,300 nodes. Bad news: 
according to the spec there shouldn’t have been any. Double-bit errors are the 
only ones that can’t be corrected.
All of these errors will corrupt user data.
When they checked 8.7 TB of user data for corruption – 33,700 files – they 
found 22 corrupted files, or about 1 in every 1,500 files.
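
The "1 in every 1,500" figure can be checked from the two counts in the text. The variable names below are just for this example.

```python
files_checked = 33_700   # files in the 8.7 TB sample, from the text
corrupted = 22           # corrupted files found, from the text

one_in = files_checked / corrupted
fraction = corrupted / files_checked

print(f"about 1 corrupted file in every {one_in:.0f}")  # 1532, i.e. roughly 1,500
print(f"corrupted share of files: {fraction:.4%}")
```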