AWS Storage Blog

Bring Amazon S3 closer to the edge with Nasuni

Customers in manufacturing, real estate, engineering and construction, healthcare, and related industries often have remote facilities with limited network bandwidth, where large amounts of data generated at the edge need to be stored, processed, and analyzed to make decisions in real time. For example, applications running in manufacturing plants need low-latency access to fast storage to meet productivity targets, because critical decisions such as increasing or decreasing production capacity are based on analysis of incoming data. For edge use cases with stringent low-latency storage requirements, local storage may seem like a viable option. However, this approach can create isolated data silos that are hard to manage and maintain, and whose data durability is questionable.

Amazon Simple Storage Service (Amazon S3), with its industry-leading scalability, data availability, security, and performance, is often a better choice than local storage. However, because Amazon S3 is a cloud-based storage service accessible over the internet, using it in environments with slow and variable network conditions can result in inconsistent throughput. To address this, Nasuni, an AWS Storage Competency Partner, offers a cache-based storage solution called Nasuni Edge for Amazon S3 that brings Amazon S3 closer to edge locations. It supports a subset of Amazon S3 APIs so that applications can read from and write to the cache using an Amazon S3 compatible protocol. Hot data is cached locally at the edge, and all data is stored centrally in a regional Amazon S3 bucket.

In this blog post, we explore the architecture, benefits, and use cases of Nasuni Edge for Amazon S3. This solution lets customers modernize edge applications that have low-latency storage requirements so that they can use cloud storage.

Nasuni Edge for Amazon S3

Nasuni offers a cache-based storage solution that customers use to replace on-premises file systems with a modern cloud-based file system that supports popular file protocols, including CIFS and NFS. It caches frequently accessed data at the edge while securely storing all data on Amazon S3. Caching data at the edge can drastically improve performance and lower costs by reducing the amount of traffic that has to traverse the public internet before going back into the private network.

On AWS Pi Day 2024, Nasuni launched Nasuni Edge for Amazon S3, an enhancement to its cache-based solution that adds support for a subset of Amazon S3 APIs. With this enhancement, applications can use Amazon S3 APIs to write data to and read data from the cache. Customers gain access to a PB-scale global namespace across AWS Regions, AWS Outposts, AWS Local Zones, and on-premises environments, with data stored centrally and durably on Amazon S3. It also offers multi-protocol access on the same dataset: data written using an Amazon S3 compatible protocol can be read using CIFS or NFS, and vice versa.

Support for the Amazon S3 protocol also makes it easier for customers to analyze their data with cloud and on-premises AI/ML services that already support the S3 protocol for data ingestion. For example, customers can connect their data repositories to AWS AI/ML services such as Amazon Kendra for enterprise-level search, Amazon Rekognition for video and photo analysis, and Amazon Textract for extracting text from documents, among others.
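To make the ingestion path a little more concrete, here is a minimal sketch of asking Amazon Textract to read a document directly from an S3 bucket. It assumes the document is available as a regular object in an Amazon S3 bucket that Textract can read; the bucket and key names, and the helper name `extract_lines`, are hypothetical. The client is passed in so the function can be tried with a stub before pointing it at a real account.

```python
def extract_lines(textract_client, bucket, key):
    """Return the lines of text that Amazon Textract detects in one S3 object.

    textract_client is expected to be boto3.client("textract") or an object
    with the same detect_document_text method.
    """
    response = textract_client.detect_document_text(
        Document={"S3Object": {"Bucket": bucket, "Name": key}}
    )
    # Textract returns PAGE, LINE and WORD blocks; keep only whole lines.
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


# Hypothetical usage (requires boto3, credentials, and a real bucket):
#   import boto3
#   lines = extract_lines(boto3.client("textract"),
#                         "example-analytics-bucket", "scans/report-001.png")
```

The same pattern applies to other services that accept an S3 object reference as input.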

Nasuni architecture

With Nasuni’s global file system, all data is consolidated and stored on Amazon S3. Copies of frequently used files and objects are cached locally on Nasuni edge appliances, which are lightweight virtual machines (VMs) that replace traditional on-premises file servers and NAS devices.
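The idea of keeping hot data at the edge while the full dataset lives in a central store can be illustrated with a toy read-through cache. This is only a sketch and not Nasuni's implementation: the least-recently-used eviction policy, the immediate write-through to the backing store, and the `EdgeCache` class itself are assumptions chosen to keep the example small, and a plain dictionary stands in for Amazon S3.

```python
from collections import OrderedDict


class EdgeCache:
    """Toy edge cache: bounded local copies in front of a complete backing store."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # stands in for the central bucket
        self.capacity = capacity            # max number of objects kept locally
        self.cache = OrderedDict()          # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def _remember(self, key, value):
        # Insert or refresh the entry, then evict least recently used ones.
        self.cache[key] = value
        self.cache.move_to_end(key)
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)

    def write(self, key, value):
        # Simplification: write through immediately so eviction never loses data.
        self.backing_store[key] = value
        self._remember(key, value)

    def read(self, key):
        if key in self.cache:
            self.hits += 1
            self.cache.move_to_end(key)
            return self.cache[key]
        # Cache miss: fetch from the central store and keep a local copy.
        self.misses += 1
        value = self.backing_store[key]
        self._remember(key, value)
        return value
```

In this model, every object is always present in the backing store, so evicting a cold object from the edge only costs a slower read the next time it is requested.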

Figure: You can deploy edge appliances on physical, virtual and cloud infrastructure

The preceding figure shows how you can deploy edge appliances on physical, virtual, and cloud infrastructure. This makes it easy to deploy them anywhere, such as in data centers and remote sites, or on Amazon Elastic Compute Cloud (Amazon EC2) instances running in an AWS Region, Outposts, or Local Zones. Data, hosted on virtualized containers called Nasuni volumes, is stored centrally on Amazon S3 and is accessible across all Nasuni edge appliances. This central data is periodically synchronized with the hot data cached at individual edges, and the gold copy is maintained in the cloud. As a result, data is available at multiple locations, which provides multi-site access and eliminates costly and cumbersome replication schemes and the need for slow WAN optimization. Nasuni’s architecture is horizontally scalable: additional appliances can be deployed to meet performance and capacity requirements. All data stored in Nasuni is encrypted and compressed to improve data security and lower storage costs.

Nasuni Edge for Amazon S3 enables modern applications that use the Amazon S3 protocol to use Nasuni as a high-performance storage system that runs closer to the edge while providing durability and regional resiliency, because all data is stored in Amazon S3. Customers can selectively enable the Nasuni Edge for Amazon S3 capability at the edge appliance level. Once enabled, Nasuni runs a service that speaks the Amazon S3 protocol, and buckets can be created on Nasuni volumes so that clients can read and write data using access and secret keys. Amazon S3 clients such as Cyberduck and libraries such as boto3 talk to the Nasuni S3 service and read from or write to the buckets on a Nasuni volume. Similarly, CIFS or NFS clients can read and write data in the same bucket when it is made accessible through a share or an export.
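As a rough illustration of the client side, the following sketch points boto3 at an S3-compatible endpoint on an edge appliance and performs a write followed by a read. The endpoint URL, Region value, bucket name, and keys are placeholders, and the helper names are hypothetical. Path-style addressing is an assumption that is common for S3-compatible endpoints, not a documented Nasuni requirement. Because only a subset of Amazon S3 APIs is supported, confirm in the Nasuni documentation that the calls you need are available.

```python
def edge_client_kwargs(endpoint_url, access_key, secret_key):
    """Build keyword arguments for boto3.client() aimed at an edge endpoint."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,        # hypothetical appliance URL
        "region_name": "us-east-1",          # placeholder, used for request signing
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def make_edge_client(endpoint_url, access_key, secret_key):
    """Create an S3 client that talks to the edge appliance instead of AWS."""
    import boto3  # imported here so the helpers around it need no dependencies
    from botocore.config import Config

    return boto3.client(
        config=Config(s3={"addressing_style": "path"}),  # assumption, see above
        **edge_client_kwargs(endpoint_url, access_key, secret_key),
    )


def roundtrip(client, bucket, key, data):
    """Write an object through the endpoint and read it back."""
    client.put_object(Bucket=bucket, Key=key, Body=data)
    return client.get_object(Bucket=bucket, Key=key)["Body"].read()


# Hypothetical usage:
#   s3 = make_edge_client("https://edge-appliance.example.com",
#                         "ACCESS_KEY", "SECRET_KEY")
#   roundtrip(s3, "example-bucket", "line-a/report.csv", b"id,value\n1,42\n")
```

The only difference from talking to Amazon S3 directly is the endpoint and the credentials, which is what lets existing S3-based applications be redirected to the edge with little change.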

Nasuni Edge continuously tracks changes by creating periodic snapshots, which are then compared at defined snapshot intervals. Any changes detected are written to the cloud, ensuring that data is always up-to-date.
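The snapshot comparison described above can be pictured with a small sketch. It is not Nasuni's implementation. It assumes that a snapshot is simply a mapping from file path to a content digest, and shows how comparing two such mappings yields the changes that would need to be sent to the cloud. The function names are hypothetical.

```python
import hashlib


def take_snapshot(files):
    """Map each path to a SHA-256 digest of its content (files: path -> bytes)."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_snapshots(previous, current):
    """Return the paths added, modified, and deleted between two snapshots."""
    added = sorted(p for p in current if p not in previous)
    modified = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    deleted = sorted(p for p in previous if p not in current)
    return {"added": added, "modified": modified, "deleted": deleted}
```

In this model, only the paths listed under `added` and `modified` need to be uploaded at the end of a snapshot interval, which keeps the traffic proportional to what changed and not to the size of the dataset.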

Use cases for Nasuni Edge for Amazon S3

The following are some of the top use cases for Nasuni Edge for Amazon S3 across industries.

Manufacturing

Customers in manufacturing industries generate a large amount of data at their manufacturing sites, which is typically ingested into a file system using the CIFS protocol. Because ingest speed directly affects the productivity of these plants, customers are always looking for faster ways to capture this data. Certain legacy applications used at these sites for defect analysis are limited to writing data over CIFS, which is often slow, especially when dealing with a large number of small files. These customers also want to use modern AI/ML tools to analyze this data and gain deeper insights.

Nasuni’s cache-based solution provides localized, low-latency access to data at the edge for manufacturing sites while maintaining the gold copy in Amazon S3 in an AWS Region. Defect analysis is accelerated by quickly ingesting large diagnostic data sets using the Amazon S3 protocol. This improves performance and manageability and allows manufacturers to modernize their workload processes by moving from CIFS to Amazon S3. Nasuni’s multi-protocol capability makes defect data available to engineers using CIFS and to modern applications using Amazon S3 on the backend. Using its Analytics connector feature, Nasuni makes a copy of the data available to AWS AI/ML services to help customers further automate processes and get insights into their data.
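For workloads made up of many small files, one common way to speed up ingest over an S3-compatible protocol is to upload files concurrently. The sketch below is a generic boto3-style pattern, not a Nasuni-specific tool. The function name, bucket, prefix, and worker count are hypothetical, and `client` is assumed to be an S3 client whose endpoint points at the edge appliance. Note that boto3's `upload_file` may switch to multipart upload for large files, so check that the endpoint supports the calls involved.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def ingest_directory(client, bucket, root, prefix="", max_workers=8):
    """Upload every file under root to bucket, using relative paths as keys.

    Returns the list of object keys that were uploaded.
    """
    root = Path(root)
    files = sorted(p for p in root.rglob("*") if p.is_file())

    def upload(path):
        # Use forward slashes in keys regardless of the local operating system.
        key = prefix + path.relative_to(root).as_posix()
        client.upload_file(str(path), bucket, key)
        return key

    # Threads overlap the per-request latency of many small uploads.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(upload, files))


# Hypothetical usage:
#   keys = ingest_directory(s3, "defect-data", "/data/line-a", prefix="line-a/")
```

Because the uploads are independent, the worker count can be tuned to the appliance and network without changing the rest of the code.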

Healthcare

Healthcare customers need to accelerate the sharing of large instrument data across locations so that patient conditions can be diagnosed faster. Many healthcare customers store data on private cloud storage. However, they also need disaster recovery (DR) in the AWS Cloud.

Nasuni’s cache-based solution provides low-latency access: large instrument data can be ingested locally using Amazon S3 for faster diagnosis and read by medical personnel worldwide through multi-site collaboration. The AWS Cloud DR solution helps meet demanding recovery point objective (RPO) and recovery time objective (RTO) requirements, reducing disruption to instrument data access.

Software development

Software development companies are looking to optimize their around-the-clock development and test workflows to ingest increasingly large builds quickly. They want to reduce data duplication across AWS Regions by avoiding expensive replication. They also need multi-protocol access to integrate with legacy applications.

Nasuni’s edge caching solution enables faster performance of software build distribution. Nasuni’s multi-site access, which makes data available for reading and writing in multiple locations, reduces data duplication across AWS Regions by avoiding expensive replication. Multi-protocol access through both Amazon S3 and CIFS makes the same data set available to legacy and modern applications.

Media and entertainment

Customers in the media and entertainment industries are looking to improve the upload and download speeds of high-resolution images and video files for movie production. Creators are distributed globally and need the files available locally for viewing, collaboration, and editing.

Nasuni’s edge caching enables quick ingest of video files using the Amazon S3 protocol. Local data capture with faster transfer speeds accelerates collaboration. Nasuni centralizes the storage, processing, creation, and distribution of media content, which improves upload and download speeds.

Conclusion

In this blog post, we covered how Nasuni Edge for Amazon S3 provides a storage solution for edge use cases that have stringent low-latency storage requirements but also need the durability, resiliency, security, and scalability of the cloud. The solution brings Amazon S3 closer to the edge, caching hot data on Nasuni Edge appliances while storing all data centrally and durably on Amazon S3.

Nasuni’s flexible architecture allows simultaneous access to the same dataset through the CIFS, NFS, and Amazon S3 protocols, so users can use object storage while still providing file system semantics for their legacy applications. In addition, Nasuni’s architecture is horizontally scalable: performance and capacity can be increased by adding edge appliances. Finally, support for the Amazon S3 protocol simplifies integration with AWS AI/ML services that ingest data using S3.

The Nasuni Edge for Amazon S3 solution is available beginning with the latest 9.14.x release. To learn more about it, visit Nasuni Edge for Amazon S3.

Sheetal Kochavara

Sheetal Kochavara is the Director of Product Management at Nasuni, leading a team of product managers and engineers to deliver innovative and scalable cloud-based solutions for enterprise data storage and protection. She has over 14 years of experience in storage product management, working for companies including EMC and VMware.

Girish Chanchlani

Girish Chanchlani is a Principal Partner Solutions Architect at AWS and is a member of the Amazon Partner Network (APN) team that works closely with ISV Storage Partners. Prior to AWS, his experience includes working for data and storage management companies as a Product Manager covering File Systems, NAS, Media Management, and Data Protection Appliance solutions.

May Olatoye

May Olatoye is a Senior Technical Product Manager on the Amazon S3 team. She is passionate about enabling customers to unlock the potential of their data and leverage it as a differentiating advantage to drive generative AI innovation. Based in NYC, May enjoys traveling and exploring new cultures through food and art in her free time.