What is object storage?
Feature: Highly scalable, simple, cheap, distributed storage for the cloud
By the end of 2012, 1.3 trillion objects were stored in Amazon S3, the world's largest and most widely known object storage system. At the time, that number was growing faster than 1 billion objects per day, so the 2 trillion mark is right around the corner.
Object storage is vastly more scalable than traditional file system storage because it's vastly simpler. Instead of organizing files in a directory hierarchy, object storage systems store files in a flat organization of containers (called "buckets" in Amazon S3) and use unique IDs (called "keys" in S3) to retrieve them. The upshot is that object storage systems require less metadata than file systems to store and access files, and they reduce the overhead of managing file metadata by storing the metadata with the object. This means object storage can be scaled out almost endlessly by adding nodes.
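To make the flat layout concrete, here is a minimal in-memory sketch. The class and method names (FlatObjectStore, put, get, delete) are invented for illustration and are not any vendor's API; the point is only that an object is found by bucket and key, and that its metadata travels with it.

```python
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)  # kept with the object itself


class FlatObjectStore:
    """Toy in-memory store: no directories, just (bucket, key) -> object."""

    def __init__(self):
        self._objects = {}

    def put(self, bucket, key, data, **metadata):
        self._objects[(bucket, key)] = StoredObject(data, dict(metadata))

    def get(self, bucket, key):
        return self._objects[(bucket, key)]

    def delete(self, bucket, key):
        del self._objects[(bucket, key)]


store = FlatObjectStore()
# The slashes in the key are just characters; no "photos" directory exists.
store.put("objectstorage1", "photos/2012/cat.jpg", b"...", content_type="image/jpeg")
obj = store.get("objectstorage1", "photos/2012/cat.jpg")
print(obj.metadata["content_type"])  # image/jpeg
```

Because every lookup is a single key match, there is no directory tree to walk or keep consistent as the object count grows.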
Reliability is achieved on ordinary hardware and disk drives by replicating objects across multiple servers and locations. If you set up your own solution, such as with OpenStack Swift, you can configure the number of storage zones and replicas to suit your needs. (OpenStack recommends at least five nodes for a production system.) Amazon promises eleven 9s of "durability" for standard Amazon S3, which translates into the loss of one file in 100 billion per year. If your data protection needs are not that extreme, you can save a few pennies with the Reduced Redundancy Storage option (four 9s of durability).
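As a rough check on durability figures like these, the link between a count of 9s and a loss rate is simple arithmetic: n 9s of durability means an expected loss of one object in 10^n per year. A loss of one in 100 billion therefore corresponds to eleven 9s (99.999999999%), and 99.99% corresponds to four 9s. The sketch below assumes the quoted durability can be read as an average annual loss rate; the function names are illustrative only.

```python
from fractions import Fraction


def annual_loss_rate(nines: int) -> Fraction:
    """Chance of losing a given object in a year at `nines` 9s of durability."""
    return Fraction(1, 10 ** nines)


def expected_losses(object_count: int, nines: int) -> Fraction:
    """Average number of objects lost per year across `object_count` objects."""
    return object_count * annual_loss_rate(nines)


print(annual_loss_rate(11))                     # 1/100000000000
print(annual_loss_rate(4))                      # 1/10000
# Applied to the 1.3 trillion objects mentioned earlier:
print(expected_losses(1_300_000_000_000, 11))   # 13
```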
The features you get in an object storage system are typically minimal. You can store, retrieve, copy, and delete files, as well as control which users can do which, and that's about it. If you want search or a central repository of object metadata that other applications can draw on, you'll generally have to implement it yourself. Amazon S3 and other object storage systems provide REST APIs that allow programmers to work with the containers and objects. SoftLayer is the rare public cloud that provides search of its object storage to users.
Finally, the HTTP interface to object storage systems allows for fast, easy access to files for users from anywhere in the world. (For example, every file in Amazon S3 has a unique URL based on the Amazon location, the name of the bucket, and the name of the file: https://s3-us-west-1.amazonaws.com/objectstorage1/object_storage.rtf.) You'll wait longer than you would accessing a file from NAS, of course, but you can't beat the convenience.
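The URL pattern in that example can be assembled mechanically. This sketch assumes only the location/bucket/file layout shown above; the function name object_url is made up here, and the endpoint formats a provider accepts today may differ.

```python
from urllib.parse import quote


def object_url(region: str, bucket: str, key: str) -> str:
    """Build a URL from location, bucket name, and file name."""
    # quote() percent-encodes unsafe characters but leaves "/" in the key alone.
    return f"https://s3-{region}.amazonaws.com/{bucket}/{quote(key)}"


url = object_url("us-west-1", "objectstorage1", "object_storage.rtf")
print(url)  # https://s3-us-west-1.amazonaws.com/objectstorage1/object_storage.rtf
```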
Besides significantly slower throughput than a traditional file system, the other big drawback of object storage is that data consistency is achieved only eventually. Whenever you update a file, you may have to wait until the change has propagated to all of the replicas before every request returns the latest version. This makes object storage unsuitable for data that changes frequently. But it's a great fit for all the data that doesn't change much, like backups, archives, video and audio files, and virtual machine images.
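A toy model shows why a read can return an old version. This is a sketch under simplifying assumptions, not how any particular service replicates data: a write is acknowledged once it reaches one replica, and a propagate() call stands in for background replication.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on one replica and spread to the rest later."""

    def __init__(self, replica_count=3):
        self.replicas = [{} for _ in range(replica_count)]
        self._pending = []  # (key, value) writes not yet copied everywhere

    def put(self, key, value):
        self.replicas[0][key] = value      # acknowledged after one replica
        self._pending.append((key, value))

    def get(self, key, replica):
        return self.replicas[replica].get(key)

    def propagate(self):
        """Stand-in for background replication catching up."""
        for key, value in self._pending:
            for replica in self.replicas:
                replica[key] = value
        self._pending.clear()


store = EventuallyConsistentStore()
store.put("report.txt", "v1")
store.propagate()
store.put("report.txt", "v2")

stale = store.get("report.txt", replica=2)   # still "v1"
store.propagate()
fresh = store.get("report.txt", replica=2)   # now "v2"
print(stale, fresh)                          # v1 v2
```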
*******************************************
* Another good paper to learn about object storage *
*******************************************
http://www.storage-switzerland.com/Blog/Entries/2012/11/30_What_Is_Object_Storage.html
This past week Storage Switzerland and other industry influencers joined Cleversafe, Data Direct Networks, Intel, Nexsan, Quantum, and Scality at the Next Generation Object Storage Summit in Florida. The first order of business was to try to explain what Object Storage is.
What is Object Storage?
Object Storage is a technology where data is stored in self-contained entities called objects. Think of an object as a file. But unlike traditional file systems, this storage method is not dependent on a hierarchical layout of directories and sub-directories. Objects are given unique ID numbers which are managed in a ‘flat’ index, significantly reducing the amount of metadata (data about data) needed to store and retrieve a file.
Storage systems that have an object storage foundation can deliver a series of capabilities that should be of interest to data centers that need to store large numbers of files or objects. The classic example is an internet-based business that provides image sharing or file sharing, but these capabilities should appeal to the larger enterprise IT space as well.
Do You Need Object Storage?
The traditional file system or NAS has historically served, and continues to serve, its primary purpose in the data center - high performance file sharing. But as the number of files stored on these systems continues to increase these file systems can become bogged down handling metadata.
An increasing number of data centers are simply adding more NAS heads, not because they are out of capacity but because NAS performance is being impacted by metadata management issues. Object storage does not face this problem since metadata is contained in the object itself, removing the metadata management burden from the file system.
A second challenge that the traditional NAS faces is making sure that the data it stores remains valid. Hard drive media can degrade over time and ‘bit rot’ can occur. Object storage, through the use of the unique ID method described above, can provide a continuous protection against this kind of silent corruption. Most of these systems create the unique ID based on the contents of the object and then recalculate it periodically comparing it to the original ID. If its unique ID changes it means the object’s data has been changed, which would indicate corruption if that data was not purposely modified. These object storage systems typically have a method for replacing the corrupted file with a known good copy.
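The check described above can be sketched in a few lines. SHA-256 is an assumption made here for illustration (the article does not say which function any given product uses), and content_id and scrub are hypothetical names.

```python
import hashlib


def content_id(data: bytes) -> str:
    """Derive an object ID from the object's contents (SHA-256 here)."""
    return hashlib.sha256(data).hexdigest()


def scrub(objects: dict, replicas: dict) -> list:
    """Recompute each ID; restore mismatches from a known good copy."""
    repaired = []
    for object_id, data in list(objects.items()):
        if content_id(data) != object_id:
            objects[object_id] = replicas[object_id]
            repaired.append(object_id)
    return repaired


original = b"quarterly results"
oid = content_id(original)
primary = {oid: original}
replica = {oid: original}

primary[oid] = b"quarterly resultz"       # simulate silent corruption
print(scrub(primary, replica) == [oid])   # True
print(primary[oid] == original)           # True
```

A production system would also verify the replica before trusting it; the sketch skips that step for brevity.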
This ability to provide data durability also means that less investment has to be made in data protection. The storage system does not need an elaborate RAID protection algorithm nor do its administrators need to suffer through long RAID rebuild cycles.
A third challenge with a high-file-count NAS is backup. When files number in the hundreds of millions, walking them to determine which ones need to be protected can take any backup software product far longer than the backup window allows. An object-based storage system can instead leverage the unique IDs to make sure that copies of each object are always available on-site and off-site.
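One way to picture this: instead of walking a directory tree, a system can compare the sets of IDs held at each site and copy only what is missing. The sketch below is an illustration under that assumption; the function names and IDs are invented, and real products will differ in how they track and move objects.

```python
def missing_ids(source_ids: set, target_ids: set) -> set:
    """IDs present at the source site but absent at the target site."""
    return source_ids - target_ids


def replicate(source: dict, target: dict) -> int:
    """Copy only the objects the target lacks; return how many were copied."""
    to_copy = missing_ids(set(source), set(target))
    for object_id in to_copy:
        target[object_id] = source[object_id]
    return len(to_copy)


on_site = {"id-001": b"a", "id-002": b"b", "id-003": b"c"}
off_site = {"id-001": b"a"}

copied = replicate(on_site, off_site)
print(copied)                # 2
print(off_site == on_site)   # True
```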
A fourth challenge with legacy NAS is the difficulty of creating an environment that locks down data so that it can't be changed or so that all iterations of a file can be tracked separately. Object storage solves this problem by once again leveraging the unique ID. The system ensures that once an object’s ID has been created it can't be changed or in some cases even deleted.
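A small sketch of the idea, assuming IDs are derived from content with SHA-256 (an assumption for illustration): changed data always gets a new ID, so earlier iterations remain intact and can be tracked separately. WriteOnceStore and its methods are hypothetical names, not a product API.

```python
import hashlib


class WriteOnceStore:
    """Toy content-addressed store: an ID, once created, never changes."""

    def __init__(self):
        self._objects = {}
        self.versions = {}  # name -> list of object IDs, oldest first

    def put(self, name: str, data: bytes) -> str:
        object_id = hashlib.sha256(data).hexdigest()
        # setdefault never overwrites, so existing objects stay untouched.
        self._objects.setdefault(object_id, data)
        self.versions.setdefault(name, []).append(object_id)
        return object_id

    def get(self, object_id: str) -> bytes:
        return self._objects[object_id]

    def delete(self, object_id: str):
        raise PermissionError("objects in this store cannot be deleted")


store = WriteOnceStore()
v1 = store.put("contract.txt", b"draft")
v2 = store.put("contract.txt", b"signed")
print(store.get(v1), store.get(v2))   # b'draft' b'signed'
```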
A final challenge is the geographic dispersion of data. Today's workforce is very mobile, and users of online services need data access worldwide. The ability to have data automatically stored in multiple locations based on policy is an important requirement for some data centers. Object storage systems vary in how they accomplish this, but once again they leverage object IDs to make it happen.
Storage Swiss Take
Object storage is not a new technology; it has been around in various forms for decades. The reason it’s moving back onto the radar screens of data center professionals is that it solves the problems that they are facing today or are getting ready to face, specifically those associated with high file count NAS environments.
Beyond just scalability it also provides key data resilience features that enable companies to maintain data integrity. For organizations that are creating massive amounts of data, data that they can monetize, and where part of the value is in storing that data for a long period of time, object storage can be an attractive option.
The explosion of unstructured data and the emergence of Big Data Archiving have created use cases where organizations have almost unlimited capacity requirements but still need good retrieval performance. Traditional NAS file storage can’t scale large enough, and the deep archives that relied on tape libraries or ‘capacity disk’ arrays don’t have the performance. Object-based storage systems, like Cleversafe, are providing an answer.
Cleversafe leverages a scale-out architecture of storage, management nodes and access nodes to create a storage system that scales almost without limit. (“Limitless scale” is the company’s tagline.) “Slicestor” storage nodes are available in 2U and 4U form factors, providing up to 135TB of raw capacity per node. Using a unique information dispersal algorithm it distributes data objects across nodes located in a single data center or geographically distributed. Cleversafe is also available as a software solution that runs on qualified hardware or as a virtual machine.
Using a proprietary erasure coding process and tier-one data encryption Cleversafe can create a secure, redundant infrastructure. Its ability to scale efficiently (without creating multiple copies) makes it ideal for Big Data environments or internet-based businesses with very large potential capacity requirements, such as photo storage websites. Shutterfly is a Cleversafe client with over 50PB deployed and is in the process of doubling that capacity.
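Cleversafe's erasure coding is proprietary, so the following is not its algorithm. It is a sketch of the simplest possible erasure code, a single XOR parity slice, to show how data can survive a lost node without keeping full copies. Real systems use stronger codes (Reed-Solomon is a common example) that tolerate several lost slices; the function names here are invented.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, k: int) -> list:
    """Split data into k equal slices plus one XOR parity slice."""
    slice_len = -(-len(data) // k)                  # ceiling division
    padded = data.ljust(slice_len * k, b"\0")
    slices = [padded[i * slice_len:(i + 1) * slice_len] for i in range(k)]
    return slices + [reduce(xor_bytes, slices)]


def rebuild(slices: list, length: int) -> bytes:
    """Recover the data when at most one slice (marked None) is lost."""
    if None in slices:
        lost = slices.index(None)
        survivors = [s for s in slices if s is not None]
        slices = slices[:lost] + [reduce(xor_bytes, survivors)] + slices[lost + 1:]
    return b"".join(slices[:-1])[:length]


data = b"object storage at scale"
slices = disperse(data, k=4)
slices[2] = None                                    # one node goes offline
print(rebuild(slices, len(data)) == data)           # True
print(len(slices) / 4)                              # 1.25
```

With four data slices and one parity slice, raw capacity is 1.25 times the data size, compared with 3 times for three full replicas.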
Cleversafe has shipped over 170PB of storage to some very large organizations in the federal space, healthcare, media and communications and internet-related industries. Their primary market is users that may have an unlimited potential for data growth but definitely not an unlimited budget to buy storage. They need very large capacity archives that can still provide the retrieval performance of online storage, not ‘next day delivery’. For many that interact with their customers over the internet, maintaining low latency in those transactions is key to keeping those customers.
Some also need storage that’s highly reliable and secure, an option that Cleversafe can support as well. Its patented AONT technology integrates encryption keys within the dispersed data objects, assuring that if data is compromised it’s still unreadable and eliminating the need for key management in the process.
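The general idea of an all-or-nothing transform (AONT) can be sketched as follows. This is not Cleversafe's patented implementation: it is a simplified illustration that uses SHA-256 as both keystream generator and hash, with invented function names, and it has not been reviewed for production use. The message is masked with a random key, and the key is stored XORed with a hash of the entire masked message, so anyone missing part of the package cannot recover the key.

```python
import hashlib
import secrets


def _keystream(key: bytes, length: int) -> bytes:
    """Expand a key into `length` bytes using SHA-256 in counter mode."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def _xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def aont_package(message: bytes) -> bytes:
    """Mask the message with a random key, then hide that key in the package."""
    key = secrets.token_bytes(32)
    masked = _xor(message, _keystream(key, len(message)))
    # The key can be recovered only from the hash of all masked bytes.
    hidden_key = _xor(key, hashlib.sha256(masked).digest())
    return masked + hidden_key


def aont_unpackage(package: bytes) -> bytes:
    masked, hidden_key = package[:-32], package[-32:]
    key = _xor(hidden_key, hashlib.sha256(masked).digest())
    return _xor(masked, _keystream(key, len(masked)))


package = aont_package(b"patient record 42")
print(aont_unpackage(package) == b"patient record 42")   # True
```

Since the key rides inside the package, there is no separate key store to manage, which matches the claim above about eliminating key management.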
Storage Swiss Take
Object storage is a technology that’s clearly taken off in 2012. Cleversafe is a relatively small company, compared with many in this industry, but theirs is not an installed base of beta customers or “$1 over cost” non-profits that paid little or nothing to become part of the development effort. They’ve been commercially deployed since 2008 and have over 2 dozen revenue customers, most in the multiple PB range. In the object storage space they’re an established supplier.
