Amazon FSx Monitoring Integration

Amazon FSx is a fully managed service for launching and running file systems that integrate with cloud-native AWS services. It offers the following file systems: Amazon FSx for Windows File Server (for business applications), Amazon FSx for Lustre (for compute-intensive workloads), Amazon FSx for NetApp ONTAP, and Amazon FSx for OpenZFS.

With Site24x7's integration with Amazon FSx, you can obtain the operational metrics of the FSx file systems hosted in your AWS infrastructure. You can also track the file system operations, like data repository tasks and backups.

Setup and configuration

1. If you haven't already, connect your AWS account with Site24x7's AWS account by either:

  • Creating Site24x7 as an IAM user.
  • Creating a cross-account IAM role.

2. On the Integrate AWS Account page, select the check box for Amazon FSx.

Policy and permissions

Site24x7 uses various Amazon FSx APIs to collect information about your file systems. Assign the AWS managed policy ReadOnlyAccess to the Site24x7 entity (IAM user or IAM role) so that Site24x7 can collect metrics and metadata. If you prefer to assign a custom policy, make sure the following read-level actions are present in the policy JSON:

  • "fsx:ListTagsForResource",
  • "fsx:DescribeBackups",
  • "fsx:DescribeDataRepositoryTasks",
  • "fsx:DescribeFileSystems"
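
For a custom policy, the four actions above could be wrapped in a policy document like the one this Python sketch prints. The `Sid` value and the wildcard `Resource` are assumptions made for illustration, not values taken from Site24x7; narrow them to suit your account.

```python
import json

# Read-level FSx actions listed above.
FSX_READ_ACTIONS = [
    "fsx:ListTagsForResource",
    "fsx:DescribeBackups",
    "fsx:DescribeDataRepositoryTasks",
    "fsx:DescribeFileSystems",
]


def build_fsx_read_policy(actions=FSX_READ_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "Site24x7FsxReadOnly",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: all FSx resources in the account
            }
        ],
    }


policy_json = json.dumps(build_fsx_read_policy(), indent=2)
print(policy_json)
```

The printed JSON can be pasted into the IAM policy editor and attached to the Site24x7 IAM user or role.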

Polling Frequency

Site24x7 queries AWS to collect Amazon FSx performance metrics according to the configured polling frequency. The polling interval is one hour by default.

IT Automations

You can add automations for the AWS services supported by Site24x7. Log in to Site24x7 and go to Admin > IT Automation Templates (+) > Add Automation Templates. Once automations are added, you can schedule them to be executed one after the other.

You can now create a data repository task or a backup for the file system using Amazon FSx automations.

Performance metrics for file systems

| Metric name | Description | Supported for file system type | Statistic | Unit |
| --- | --- | --- | --- | --- |
| Data Read Bytes | Number of bytes for file system read operations. | All | Sum | MB |
| Data Write Bytes | Number of bytes for file system write operations. | All | Sum | MB |
| Data Write Operations | Number of write operations. | All | Sum | Count |
| Data Read Operations | Number of read operations. | All | Sum | Count |
| Metadata Operations | Number of metadata operations. | All | Sum | Count |
| Free Storage Capacity | Amount or percentage of available storage capacity. | All | Average | GB/Percentage |
| Total Throughput | Total throughput of the file system. | All | Average | MB/sec |
| Read Throughput | Read throughput of the file system. | All | Average | MB/sec |
| Write Throughput | Write throughput of the file system. | All | Average | MB/sec |
| Total IOPS | Total number of I/O operations per second. | All | Average | Count/sec |
| Read IOPS | Total number of read I/O operations per second. | All | Average | Count/sec |
| Write IOPS | Total number of write I/O operations per second. | All | Average | Count/sec |
| Metadata IOPS | Total number of metadata I/O operations per second. | All | Average | Count/sec |
| Client Connections | The number of active connections between clients and the file server. | Windows and OpenZFS | Sum | Count |
| Network Throughput Utilization | The percent utilization of network throughput for the file system. | All file system types except Lustre | Average | Percentage |
| CPU Utilization | The percentage utilization of your file server’s CPU resources. | All file system types except Lustre | Average | Percentage |
| Memory Utilization | The percentage utilization of your file server’s memory resources. | Windows and OpenZFS | Average | Percentage |
| File Server Disk Throughput Utilization | The disk throughput between your file server and its storage volumes, as a percentage of the provisioned limit determined by throughput capacity. | All file system types except Lustre | Average | Percentage |
| File Server Disk Throughput Balance | The percentage of available burst credits for disk throughput between your file server and its storage volumes. Valid for file systems provisioned with a throughput capacity of 256 MBps or less. | All file system types except Lustre | Average | Percentage |
| File Server DiskIops Utilization | The disk IOPS between your file server and storage volumes, as a percentage of the provisioned limit determined by throughput capacity. | All file system types except Lustre | Average | Percentage |
| File Server DiskIops Balance | The percentage of available burst credits for disk IOPS between your file server and its storage volumes. Valid for file systems provisioned with a throughput capacity of 256 MBps or less. | All file system types except Lustre | Average | Percentage |
| Disk Read Bytes | The number of bytes for read operations that access storage volumes. | All file system types except Lustre | Sum | Bytes |
| Disk Write Bytes | The number of bytes for write operations that access storage volumes. | All file system types except Lustre | Sum | Bytes |
| Disk Read Operations | The number of read operations for the file server accessing storage volumes. | All file system types except Lustre | Sum | Count |
| Disk Write Operations | The number of write operations for the file server accessing storage volumes. | All file system types except Lustre | Sum | Count |
| Disk Throughput Utilization (HDD only) | The disk throughput between your file server and its storage volumes, as a percentage of the provisioned limit determined by the storage volumes. | Windows | Average | Percentage |
| Disk Throughput Balance (HDD only) | The percentage of available burst credits for disk throughput and disk IOPS for the storage volumes. | Windows and OpenZFS | Average | Percentage |
| Disk IOPS Utilization (SSD only) | The disk IOPS between your file server and storage volumes, as a percentage of the provisioned IOPS limit determined by the storage volumes. | All file system types except Lustre | Average | Percentage |
| Deduplication Saved Storage | The amount of storage space saved by data deduplication, if it is enabled. | Windows | Sum | Bytes |
| Logical Disk Usage | The amount of logical data stored (uncompressed). | Lustre | Sum | Bytes |
| Physical Disk Usage | The amount of storage physically occupied by file system data (compressed). | Lustre | Sum | Bytes |
| File Create Operations | The total number of file create operations. | Lustre | Sum | Count |
| File Open Operations | The total number of file open operations. | Lustre | Sum | Count |
| File Delete Operations | The total number of file delete operations. | Lustre | Sum | Count |
| Stat Operations | The total number of stat operations. | Lustre | Sum | Count |
| Rename Operations | The total number of directory renames, whether in-place directory renames or cross-directory renames. | Lustre | Sum | Count |
| Directory Delete Operations | The total number of directory delete operations. | Lustre | Sum | Count |
| Directory Create Operations | The total number of directory create operations. | Lustre | Sum | Count |
| NFS Bad Calls | The number of calls rejected by the NFS server remote procedure call (RPC) mechanism. | OpenZFS | Sum | Count |
| File Server Cache Hit Ratio | For OpenZFS: The percentage of cache hits. For Single-AZ 2 (non-HA and HA) file systems, this metric reports the cache hit ratio for both the in-memory (ARC) and NVMe (L2ARC) caches. For Single-AZ 1 (non-HA and HA) file systems, this metric reports only the cache hit ratio for the ARC cache. For ONTAP: The percentage of all read requests that are served by data in the file system's RAM and NVMe caches. A higher percentage means that more reads are served by the file system's read caches. | OpenZFS and ONTAP | Average | Percentage |
| Compression Ratio | The ratio of compressed storage usage to uncompressed storage usage. | OpenZFS | Average | Ratio |
| Storage Efficiency Savings | The bytes saved from storage efficiency features (compression, deduplication, and compaction). | ONTAP | Sum | Bytes |
| Logical Data Stored | The total amount of logical data stored on the file system, considering both the SSD tier and the capacity pool tier. This metric includes the total logical size of snapshots and FlexClones but does not include storage efficiency savings achieved through compression, compaction, and deduplication. | ONTAP | Sum | Bytes |
| Network Sent Bytes | The number of bytes (network I/O) sent by the file system. | ONTAP | Sum | Bytes |
| Network Received Bytes | The number of bytes (network I/O) received by the file system. | ONTAP | Sum | Bytes |
| Data Read Operation Time | The sum of total time spent within the file system for read operations (network I/O) from clients accessing data in the file system. | ONTAP | Sum | Seconds |
| Data Write Operation Time | The sum of total time spent within the file system for fulfilling write operations (network I/O) from clients accessing data in the file system. | ONTAP | Sum | Seconds |
| Capacity Pool Read Bytes | The number of bytes read (network I/O) from the file system's capacity pool tier. | ONTAP | Sum | Bytes |
| Capacity Pool Write Bytes | The number of bytes written (network I/O) to the file system's capacity pool tier. | ONTAP | Sum | Bytes |
| Capacity Pool Read Operations | The number of read operations (network I/O) from the file system's capacity pool tier. This translates to a capacity pool read request. | ONTAP | Sum | Count |
| Capacity Pool Write Operations | The number of write operations (network I/O) to the file system from the capacity pool tier. This translates to a write request. | ONTAP | Sum | Count |
| Storage Capacity Utilization | The percent utilization of storage capacity for the file system. | All | Average | Percentage |
| Storage Used | The total storage capacity used for the file system in GB. | All | Sum | Bytes |
| Read Operations | The average data read operation time per data read operation. | ONTAP | Average | Seconds |
| Write Operations | The average data write operation time per data write operation. | ONTAP | Average | Seconds |
| Metadata Operations | The average time taken per metadata operation. | ONTAP | Average | Seconds |
| Capacity Pool Tier | The used physical storage capacity in bytes, specific to the storage tier. This value includes savings from storage-efficiency features, such as data compression and deduplication. Reported with StorageTier set to StandardCapacityPool. | ONTAP | Average | Bytes |
| Primary Tier Capacity | The storage capacity for all data types with storage tier as SSD. | ONTAP | Average | Bytes |
| Primary Tier Used | The used physical storage capacity in bytes, specific to the storage tier. This value includes savings from storage-efficiency features, such as data compression and deduplication. With StorageTier as SSD, this metric measures the logical space usage for this volume for your SSD. | ONTAP | Average | Bytes |
| Primary Tier Avail | The available or unused physical storage capacity in bytes, specific to the storage tier. | ONTAP | Average | Bytes |
| Metadata Operation Time | The total time taken by metadata operations. | ONTAP | Sum | Seconds |
| Available Volumes | The number of available volumes. | OpenZFS and ONTAP | Sum | Count |
| Failed Volumes | The number of failed volumes. | OpenZFS and ONTAP | Sum | Count |
| Misconfigured Volumes | The number of misconfigured volumes. | OpenZFS and ONTAP | Sum | Count |
| Created Volumes | The number of created volumes. | OpenZFS and ONTAP | Sum | Count |
| Available SVM | The number of available storage virtual machines (SVMs). | ONTAP | Sum | Count |
| Failed SVM | The number of failed SVMs. | ONTAP | Sum | Count |
| Misconfigured SVM | The number of misconfigured SVMs. | ONTAP | Sum | Count |
| Total Volumes | The total number of volumes in the file system. | OpenZFS and ONTAP | Sum | Count |
| Total SVM | The total number of storage virtual machines in the file system. | ONTAP | Sum | Count |
| No Data Compression OpenZFS Volume | The method used to compress the data on the volume can be NONE, ZSTD, or LZ4. This metric shows the number of volumes that use no compression method. | OpenZFS | Sum | Count |
| Zstandard (ZSTD) Compression OpenZFS Volume | The number of volumes that use the Zstandard (ZSTD) compression algorithm to compress the data on the volume. | OpenZFS | Sum | Count |
| LZ4 Compression OpenZFS Volume | The number of volumes that use the LZ4 compression algorithm to compress the data on the volume. | OpenZFS | Sum | Count |
| Clone Volume | The number of volumes that reference the data in the origin snapshot, that is, volumes that use the clone strategy when copying data from the snapshot to the new volume. | OpenZFS | Sum | Count |
| Full Copy Volume | The number of volumes that copy all data from the snapshot to the new volume, that is, volumes that use the full-copy strategy. | OpenZFS | Sum | Count |
| Incremental Copy OpenZFS Volume | The number of volumes that use an incremental copy strategy when copying data from the snapshot to the new volume. This option is only for updating an existing volume by using a snapshot from another FSx for OpenZFS file system. | OpenZFS | Sum | Count |
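
The rate metrics in the table above relate to the Sum metrics through the length of the sampling period. The Python sketch below shows that arithmetic under stated assumptions: byte totals are in bytes, the period is in seconds, and 1 MB is taken as 1,048,576 bytes. The function names and sample values are hypothetical, and Site24x7 may compute or round these values differently.

```python
BYTES_PER_MB = 1024 * 1024  # assumption: binary megabytes


def total_throughput_mb_per_sec(read_bytes, write_bytes, period_seconds):
    """Average throughput over the period: (read + write bytes) / period."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_bytes + write_bytes) / period_seconds / BYTES_PER_MB


def total_iops(read_ops, write_ops, metadata_ops, period_seconds):
    """Average I/O operations per second across all operation types."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return (read_ops + write_ops + metadata_ops) / period_seconds


# Hypothetical one-minute sample: 300 MB read, 300 MB written.
throughput = total_throughput_mb_per_sec(300 * BYTES_PER_MB, 300 * BYTES_PER_MB, 60)
iops = total_iops(read_ops=1200, write_ops=500, metadata_ops=100, period_seconds=60)
print(throughput, iops)
```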

Performance metrics for data repository tasks

| Attribute | Description | Statistic | Data type |
| --- | --- | --- | --- |
| Succeeded Count | Number of files successfully exported. | Sum | Count |
| Failed Count | Number of files that failed to export. | Sum | Count |
| Total Count | Total number of files to export. | Sum | Count |
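
These three counts can be combined into progress and failure percentages for a task. The sketch below is an illustration only: the function name and sample numbers are hypothetical, and it assumes Total Count covers every file the task will process.

```python
def task_progress(succeeded, failed, total):
    """Return (percent of files processed, percent of processed files that failed)."""
    processed = succeeded + failed
    percent_processed = 100.0 * processed / total if total else 0.0
    percent_failed = 100.0 * failed / processed if processed else 0.0
    return percent_processed, percent_failed


# Hypothetical task: 180 files exported, 20 failed, 400 in total.
done_pct, failed_pct = task_progress(succeeded=180, failed=20, total=400)
print(f"{done_pct:.1f}% processed, {failed_pct:.1f}% of processed files failed")
```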

Forecast

Estimate future values of the following performance metrics and make informed decisions about adding capacity or scaling your AWS infrastructure.

  • Data Read Bytes
  • Data Write Bytes
  • Data Write Operations
  • Data Read Operations
  • Metadata Operations
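
This page does not describe how Site24x7 produces its forecasts, so the sketch below is not its algorithm. It only illustrates the general idea of projecting a metric forward, using an ordinary least-squares line fitted to equally spaced samples. The function name and the sample history are hypothetical.

```python
def linear_forecast(values, steps_ahead):
    """Fit y = a + b*x to equally spaced samples and extrapolate steps_ahead points."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + steps_ahead)


# Hypothetical Data Read Bytes history (MB per polling interval).
history_mb = [100.0, 110.0, 120.0, 130.0]
projected = linear_forecast(history_mb, steps_ahead=2)
print(projected)
```

A straight-line fit ignores seasonality and bursts, so treat the projection as a rough capacity-planning aid rather than a prediction.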

Site24x7's Amazon FSx monitoring interface

Summary

Gain an overview of the different events occurring within each FSx file system with time series charts. This section provides you with operational information on data read operations, data write operations, metadata operations, throughput, read or write bytes, IOPS usage, and more.

Data Repository Tasks

All the metadata related to data repository tasks is listed here. This includes the task ID, task status, life cycle state, failure reason (if any), and the timestamps for task creation, start, and end. The Action column lets you set up alerts or add an automation in case the data repository task is down.

Backup Details

Details of the backups taken for an FSx file system are listed here. This includes the backup time, type, ID, life cycle state, KMS key ARN, and Active Directory ID. To stop monitoring a particular backup, click the delete option next to it.

Outages

The Outages tab shows the history of your file systems’ various states, like down, trouble, critical, or maintenance. It also provides details on the start and end time of an outage, its duration, and comments (if any). You can also manually add an outage and edit or delete the comments in this same section.

Log Report

Here you can view the audit log data for an FSx file system, along with details on the timestamp, status, data read bytes, data write bytes, and data read/write operations.
