Samsung SSD 840: Testing the Endurance of TLC NAND
by Kristian Vättö on November 16, 2012 10:18 AM EST
Posted in: Storage, SSDs, Samsung, TLC, Samsung SSD 840
NAND endurance is something that always raises questions among those considering a move to solid state storage. Even though we have shown more than once that the endurance of today's MLC NAND based SSDs is more than enough for even enterprise workloads, the misconception that SSDs have a short lifespan lives on. Back in the day when we had 3Xnm MLC NAND with 5,000 P/E cycles, people were worried about wearing out their SSDs, although there was absolutely nothing to worry about. The move to ~20nm MLC NAND has reduced the available P/E cycles to 3,000, but that's still plenty.
We have tested MLC NAND endurance before, but with the release of the Samsung SSD 840 we had something new to test: TLC NAND. We have explained the architectural differences between SLC, MLC and TLC NAND several times by now, but I'll do a brief recap here (I strongly recommend reading the detailed explanation if you want to truly understand how TLC NAND works):
NAND | SLC | MLC | TLC |
Bits per Cell | 1 | 2 | 3 |
P/E Cycles (2Xnm) | 100,000 | 3,000 | 1,000 |
Read Time | 25µs | 50µs | ~75µs |
Program Time | 200-300µs | 600-900µs | ~900-1350µs |
Erase Time | 1.5-2ms | 3ms | ~4.5ms |
The main difference is that MLC stores two bits per cell, whereas TLC stores three. This results in eight voltage states instead of four (which also means that one TLC cell has eight possible data values). Voltages used to program the cell are usually between 15V and 18V, so there isn't exactly a lot of room to play with when you need to fit twice as many voltage states within the same space. The problem is that when the cell gets cycled (i.e. programmed and erased), the room taken by one voltage state increases due to electron trapping and current leakage. TLC can't tolerate as much change in the voltage states as MLC can because there is less voltage headroom, and you can't end up in a situation where two voltage states become one (the cell wouldn't give valid values because it can't tell whether it's programmed as "110" or "111", for example). Hence the endurance of TLC NAND is lower; it simply cannot be programmed and erased as many times as MLC NAND, and thus you can't write as much to a TLC NAND based SSD.
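To make the doubling of voltage states concrete, here is a minimal sketch. It assumes an idealized cell in which the states are spaced evenly across a fixed voltage window, which real NAND does not strictly do; the function name is made up for illustration.

```python
# Illustrative only: how the number of voltage states grows with bits per cell.
# Assumes evenly spaced states in a fixed voltage window (an idealization).

def voltage_states(bits_per_cell: int) -> int:
    """Each extra bit doubles the number of distinct states a cell must hold."""
    return 2 ** bits_per_cell


for name, bits in (("SLC", 1), ("MLC", 2), ("TLC", 3)):
    states = voltage_states(bits)
    # Share of the voltage window available to each state if spacing were equal.
    share = 1 / states
    print(f"{name}: {bits} bit(s) per cell, {states} states, "
          f"{share:.1%} of the window per state")
```

Under that assumption each TLC state gets half the share of the window that an MLC state gets, which is why the same amount of drift from cycling hurts TLC sooner.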
No manufacturer has openly wanted to discuss the endurance of TLC, so the numbers we have seen before have been educated guesses. 1,000 - 1,500 P/E cycles is what I've heard for TLC NAND. The reality can also be different from what manufacturers claim, as we discovered with the Intel SSD 335 (though there is a high probability that it's just a firmware bug), so actually testing the endurance is vital.
There was one obstacle, though. Samsung does not report NAND writes like Intel does, and without NAND writes we can't know for sure how much data is written to the NAND because of write amplification. Fortunately, there is a workaround: I wrote incompressible 128KB sequential data (QD=1) to the drive and recorded the duration of each run and the Wear Leveling Count (similar to Intel's Media Wear Indicator). If I know the average write speed and the duration, I can figure out how much I wrote to the drive. Large-block sequential data should also result in write amplification near 1x because the data is sequential and thus doesn't fragment the drive. I then compared the amount of data I wrote to the WLC values I had recorded:
Samsung SSD 840 (250GB) Endurance Testing | |
Total Amount of Data Written | 92,623 GiB |
Total Amount of WLC Exhausted | 34 |
Estimated Total Amount of P/E Cycles | 1,064 |
Estimated Total Write Endurance | 272,420 GiB |
It seems that 1,000 P/E cycles is indeed accurate. The raw Wear Leveling Count seems to indicate the number of exhausted P/E cycles: it rises as the normalized WLC value falls, and once the raw value hits 1,000, the normalized WLC will hit zero.
Note that if Samsung's WLC is anything like Intel's Media Wear Indicator, when the normalized counter value drops to 0 there's still a good amount of endurance actually left on the NAND (it could be as high as another 20 - 30%). At least with Intel drives, the MWI hitting 0 is a suggestion that you may want to think about replacing the drive and not a warning of imminent failure.
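The arithmetic behind the endurance table can be sketched as follows. This is a rough reconstruction, not the exact procedure used for the test: it assumes write amplification of about 1x, a normalized WLC that counts down from 100, and 256GiB of raw NAND on the 250GB drive (inferred from the table's figures). The function names and the example write speed are hypothetical.

```python
# Rough reconstruction of the endurance estimate; all names are illustrative.
NAND_CAPACITY_GIB = 256  # assumption: raw NAND on the 250GB drive


def data_written_gib(avg_speed_mib_s: float, duration_s: float) -> float:
    """Data written in one run = average write speed x duration."""
    return avg_speed_mib_s * duration_s / 1024


def estimate_endurance(written_gib: float, wlc_points_used: float,
                       nand_gib: float = NAND_CAPACITY_GIB):
    """Extrapolate total endurance from the share of WLC consumed so far.

    Assumes the normalized WLC falls linearly from 100 to 0 and that
    write amplification is close to 1x (sequential, incompressible data).
    """
    total_gib = written_gib / wlc_points_used * 100
    pe_cycles = total_gib / nand_gib
    return total_gib, pe_cycles


# Example run: a hypothetical 256 MiB/s sustained for one hour.
print(f"One run: {data_written_gib(256, 3600):.0f} GiB")

# Figures from the table: 92,623 GiB written, 34 WLC points exhausted.
total_gib, pe_cycles = estimate_endurance(92_623, 34)
print(f"Estimated endurance: about {total_gib:,.0f} GiB, "
      f"about {pe_cycles:,.0f} P/E cycles")
```

With the table's inputs this lands at roughly 272,000 GiB and about 1,064 P/E cycles, in line with the reported estimates.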
Conclusions
1,000 P/E cycles may not sound like much, but when it's put into perspective, it's still plenty. Client workloads rarely exceed 10GiB of writes per day on average and write amplification should stay within reasonable magnitudes as well:
SSD Lifetime Estimation | ||||
NAND | MLC—3K P/E Cycles | TLC—1K P/E Cycles | ||
NAND Capacity | 128GiB | 256GiB | 128GiB | 256GiB |
Writes per Day | 10GiB | 10GiB | 10GiB | 10GiB |
Write Amplification | 3x | 3x | 3x | 3x |
Total Estimated Lifespan | 35.0 years | 70.1 years | 11.7 years | 23.4 years |
Of course, if you write 20GiB a day, the estimated lifespan will be halved, although we are still looking at several years. Even with 30GiB of writes a day, the 256GiB TLC drive should be sufficient in terms of endurance. Write amplification can also go over 10x if your workload consists heavily of random writes, but that is more common on the enterprise side; client workloads are usually much lighter.
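The lifespan figures above follow from a simple formula: total NAND writes available (capacity x P/E cycles) divided by NAND writes per day (host writes x write amplification). A minimal sketch, assuming 365.25 days per year, which reproduces the table's rounding; the function name is made up for illustration.

```python
# Minimal lifespan estimate; assumes perfect wear leveling across all NAND.
DAYS_PER_YEAR = 365.25  # assumption chosen to match the table's rounding


def lifespan_years(nand_gib: float, pe_cycles: int,
                   writes_per_day_gib: float,
                   write_amplification: float) -> float:
    """Years until the rated P/E cycles are used up."""
    total_nand_writes_gib = nand_gib * pe_cycles
    nand_writes_per_day_gib = writes_per_day_gib * write_amplification
    return total_nand_writes_gib / nand_writes_per_day_gib / DAYS_PER_YEAR


for nand, cycles in (("MLC", 3000), ("TLC", 1000)):
    for capacity in (128, 256):
        years = lifespan_years(capacity, cycles, 10, 3)
        print(f"{nand} {capacity}GiB @ 10GiB/day, 3x WA: {years:.1f} years")

# The heavier scenario from the text: 30GiB/day on the 256GiB TLC drive.
print(f"TLC 256GiB @ 30GiB/day, 3x WA: {lifespan_years(256, 1000, 30, 3):.1f} years")
```

Changing `writes_per_day_gib` or `write_amplification` shows how quickly the estimate shrinks for heavier workloads; at 30GiB a day the 256GiB TLC drive still comes out at just under eight years.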
Furthermore, it should be kept in mind that all SMART values that predict lifespan are conservative; it's highly unlikely that your drive will drop dead once the WLC or MWI hits zero. There is a great example at XtremeSystems where a 256GB Samsung SSD 830 is currently at nearly 6,000TiB of writes. Its WLC hit zero at 828TiB of writes, which means its endurance is over seven times higher than what the SMART values predicted. That doesn't mean all drives are as durable, but SSDs from NAND manufacturers in particular (e.g. Intel, Crucial/Micron, Samsung) seem to be more durable than what the SMART values and datasheets indicate, which isn't a surprise given that they can cherry-pick the highest quality NAND chips.
48 Comments
Mumrik - Friday, November 16, 2012
10GiB a day would be a very slow day for me, so these numbers are a little scary looking. Not that I expect to have a smallish SSD for 10 years, but still...
jmke - Friday, November 16, 2012
Using a PC drive 5 years is not a very long time with the IT industry stagnating hardware wise. I would avoid these drives for anything you plan to have work for more than 5 years :)
irev210 - Friday, November 16, 2012
jmke, the numbers are based off SMART, not actual, which is conservative.
Second, 20GiB of writes a day for 5 years is very aggressive for a typical client workload.
Anyone that is doing that sort of writes is probably doing specialized work that requires something a bit higher-end than your typical office client.
Alexvrb - Friday, November 16, 2012
Exactly. If you are doing something like 30GB+ of WRITES per day, you should have a good MLC drive like the 840 Pro, or even an SLC scratch drive + larger MLC drive.
The non-Pro, TLC 840 is strictly a consumer drive, and it will easily last 5-10 years in a typical consumer system.
mvyrmnd - Sunday, April 12, 2020
I'm here in April 2020, still using my 500GB 840 EVO with no issues. Bought it in 2013.
Mr_Sp0ck - Wednesday, February 9, 2022
I have a 250GB 840 EVO, and I'm doing 5,924GB Host Writes, with 96% drive life left.
I've just replaced it (and a 240GB Intel 530 Series drive), in my main work systems, with dual 500GB WD Blue 3D drives.
So far, the stats for the new WDs seem impressive. I'm doing only 19GB TLC NAND writes, out of a total of 81GB host writes, thanks to the vast improvements in controller technologies, along with the tier caching technology that WD implemented in these drives.
I've read that WD claims compression might even take place before 'desired' data is moved from SLC cache to the final TLC destination. It seems that has influenced my 0.22x WAF numbers.
I have a total of 8 SSDs by the way. I could post drive usage data, for educational purposes, should anyone desire.
robmuld - Friday, November 16, 2012
I don't understand how you can assume SMART values are worse than actual ones? How many times has a product had a problem when it's "supposed" to be working?
This is something Anandtech should test for themselves, rather than take the manufacturer at their word. Do a continuous write test and see how many weeks it takes to fail. I'm pretty sure you'll encounter some surprises.
Kristian Vättö - Friday, November 16, 2012
A premature death is always a possibility but usually that's not due to the NAND endurance.
Totally killing the drive is obviously the best way to test endurance but the problem is that it will take weeks, possibly even months to complete and I couldn't test any other SSDs during that period.
robmuld - Friday, November 16, 2012
Just set up a cheap box with SATA3 and let it run. Why does it have to occupy the main test rig? After doing the main benchmarks stick the drive in the torture chamber. It would add a lot of value to reviews.
Remember when the stuttering problem was first discovered? It took some extra effort but it was worth it, and drive manufacturers ended up changing their products as a result.
Sivar - Friday, November 16, 2012
Read the XtremeSystems link and then decide if Anandtech is blindly taking manufacturers at their word.