Troubleshooting
Table of Contents
- Default EBS encryption with a customer-managed KMS key Elastio does not have permission to use
- VPC misconfigurations
- Enable VSS for an AWS EC2 instance
- Wrong AWS account or region
- Cloud Connector is not present in the selected region
- The assets are not displayed in the Elastio Tenant
- A default vault is missing
- Integrity scan fails
- Deploying the Elastio CFN in the wrong AWS account
- EC2 backup fails
- Block mount fails
In this section you can find the most frequent failure modes and how to work around them.
Default EBS encryption with a customer-managed KMS key Elastio does not have permission to use
An EBS volume backup fails if the volume is encrypted with a customer-managed KMS key that Elastio does not have access to. An EC2 backup fails in the same way if one or more of the instance's volumes are encrypted with such a key.
The failure shows up as the following error message on the Jobs page of your Tenant or in the Elastio CLI:
"Ebs operation has failed: Failed to access volume. Most likely there is a problem with encrypted volume and permissions. Check permissions for KMS key: {KMS key ARN}"
It happens because Elastio’s background job task roles don’t have permission to use the KMS key used for the EBS encryption. It might occur if there is an encryption policy for all newly created EBS volumes or if a single volume was created with a KMS key that Elastio cannot access.
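To tell which of the two cases applies, the AWS CLI can report both the account-level default and the key of a specific volume. The following is a minimal sketch, assuming the AWS CLI is configured for the account and region of the volume; the function name and the volume ID in the usage comment are placeholders, not part of Elastio.

```shell
# Sketch: report how EBS encryption is configured for the current region
# and which KMS key encrypts one volume. The function name is hypothetical.
ebs_encryption_report() {
  volume_id="$1"

  # Is "encrypt all new EBS volumes" enabled in the current region?
  default_on=$(aws ec2 get-ebs-encryption-by-default \
    --query 'EbsEncryptionByDefault' --output text)

  # Which KMS key is used for default encryption?
  default_key=$(aws ec2 get-ebs-default-kms-key-id \
    --query 'KmsKeyId' --output text)

  # Which KMS key encrypts this volume? Prints "None" for an unencrypted volume.
  volume_key=$(aws ec2 describe-volumes --volume-ids "$volume_id" \
    --query 'Volumes[0].KmsKeyId' --output text)

  echo "default_encryption=$default_on"
  echo "default_key=$default_key"
  echo "volume_key=$volume_key"
}

# Usage (placeholder volume ID):
#   ebs_encryption_report vol-0123456789abcdef0
```

If `volume_key` is a customer-managed key, that is the key whose users need to be updated.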
To fix this, locate the KMS key used for encryption in your AWS console and, in the “Key users” section, press the “Add” button.
Figure 1.1 Key users
In the list, search for the two roles whose names begin with:
elastio-account-level-stack-ebsBgJobs
elastio-account-level-stack-ec2BgJobs
Add both to the key users.
The JSON below is an example of these permissions written as a key policy statement. Note: The actual Elastio IAM role names end with a randomly generated suffix that is unique to each AWS account, and the ARNs contain the ID of the AWS account where Elastio is deployed. Make sure to replace the example role names and account ID below with those of your actual EBS and EC2 background job roles:
{
"Sid": "Allow Elastio backup and restore jobs use of the key",
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::aws_account_id:role/elastio-account-level-stack-ebsBgJobs%random_suffix%",
"arn:aws:iam::aws_account_id:role/elastio-account-level-stack-ec2BgJobs%random_suffix%"
]
},
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
],
"Resource": "*"
}
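The actual role ARNs, including the account ID and the random suffix, can be looked up with the AWS CLI instead of the console. This is a small sketch, assuming the caller is allowed to run `iam:ListRoles`; the function name is hypothetical.

```shell
# Sketch: print the ARNs of all IAM roles whose names start with a prefix.
# starts_with() is a built-in JMESPath function evaluated by the AWS CLI.
elastio_role_arns() {
  prefix="$1"
  aws iam list-roles \
    --query "Roles[?starts_with(RoleName, '${prefix}')].Arn" \
    --output text
}

# Usage, with the two prefixes from the list above:
#   elastio_role_arns elastio-account-level-stack-ebsBgJobs
#   elastio_role_arns elastio-account-level-stack-ec2BgJobs
```

The printed ARNs can be pasted into the "Principal" element in place of the example values.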
An example KMS key policy will look like so:
{
"Id": "key-consolepolicy-3",
"Version": "2012-10-17",
"Statement": [
{
"Sid": "Enable IAM User Permissions",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::*:root"
},
"Action": "kms:*",
"Resource": "*"
},
{
"Sid": "Allow access for Key Administrators",
"Effect": "Allow",
"Principal": {
},
"Action": [
"kms:Create*",
"kms:Describe*",
"kms:Enable*",
"kms:List*",
"kms:Put*",
"kms:Update*",
"kms:Revoke*",
"kms:Disable*",
"kms:Get*",
"kms:Delete*",
"kms:TagResource",
"kms:UntagResource",
"kms:ScheduleKeyDeletion",
"kms:CancelKeyDeletion"
],
"Resource": "*"
},
{
"Sid": "Allow use of the key",
"Effect": "Allow",
"Principal": {
"AWS": [
"arn:aws:iam::*:role/elastio-account-level-stack-ebsBgJobs$RANDOM_SUFFIX",
"arn:aws:iam::*:role/elastio-account-level-stack-ec2BgJobs$RANDOM_SUFFIX"
]
},
"Action": [
"kms:Encrypt",
"kms:Decrypt",
"kms:ReEncrypt*",
"kms:GenerateDataKey*",
"kms:DescribeKey"
],
"Resource": "*"
},
{
"Sid": "Allow attachment of persistent resources",
"Effect": "Allow",
"Principal": {
"AWS": "arn:aws:iam::*:role/elastio-account-level-stack-ebsBgJobs$RANDOM_SUFFIX"
},
"Action": [
"kms:CreateGrant",
"kms:ListGrants",
"kms:RevokeGrant"
],
"Resource": "*",
"Condition": {
"Bool": {
"kms:GrantIsForAWSResource": "true"
}
}
}
]
}
VPC misconfigurations
If a vault is deployed into a VPC whose subnets have “auto-assign public IPv4 address” enabled but have no Internet Gateway, the AWS Batch job for the backup gets stuck in the Runnable state, and no error message is delivered.
Currently, Elastio shows the VPC misconfiguration warning upon deployment if the compute environment of the background job has a subnet with the public_ip_on_launch toggle disabled. If the subnet has public_ip_on_launch enabled, the VPC misconfiguration warning is not displayed. Additionally, Amazon ECS (and therefore AWS Batch) doesn’t work as expected if the public subnet has no Internet Gateway (IGW) attached and no route to the IGW.
Note: The VPC misconfiguration warning reads: “The Cloud Connector will be deployed into a private subnet and will need manual configuration. Please see documentation for more details. One of our cloud team members will reach out with assistance.”
The warning can be safely ignored if the subnets you deploy the vault into have a NAT gateway with access to the Internet, in particular access to all AWS API endpoints.
To resolve the networking issues, create a private subnet with a NAT gateway. The steps below will help you to do that:
- Create a public NAT gateway in the public subnet.
- In the private subnet's route table, route 0.0.0.0/0 through the NAT gateway.
- Make sure the public subnet's route table has a route for 0.0.0.0/0 to the IGW.
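The first two steps can also be scripted with the AWS CLI. The sketch below assumes that the public subnet already has its route to the IGW and that the private route table has no existing 0.0.0.0/0 route (otherwise `aws ec2 replace-route` would be needed instead of `create-route`). The function name and all resource IDs are placeholders.

```shell
# Sketch: create a public NAT gateway and send the private subnet's
# default route through it. Prints the ID of the new NAT gateway.
route_private_subnet_through_nat() {
  public_subnet_id="$1"        # subnet whose route table points at the IGW
  private_route_table_id="$2"  # route table associated with the private subnet

  # A public NAT gateway needs an Elastic IP address.
  allocation_id=$(aws ec2 allocate-address --domain vpc \
    --query 'AllocationId' --output text) || return 1

  nat_id=$(aws ec2 create-nat-gateway \
    --subnet-id "$public_subnet_id" \
    --allocation-id "$allocation_id" \
    --query 'NatGateway.NatGatewayId' --output text) || return 1

  # The gateway cannot be used as a route target until it is available.
  aws ec2 wait nat-gateway-available --nat-gateway-ids "$nat_id" || return 1

  aws ec2 create-route \
    --route-table-id "$private_route_table_id" \
    --destination-cidr-block 0.0.0.0/0 \
    --nat-gateway-id "$nat_id" >/dev/null || return 1

  echo "$nat_id"
}

# Usage (placeholder IDs):
#   route_private_subnet_through_nat subnet-0123456789abcdef0 rtb-0123456789abcdef0
```

Note that a NAT gateway and its Elastic IP are billed resources, so they should be removed if the setup is abandoned.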
Note: The selected VPC should have a subnet with “auto-assign public IPv4 address” enabled, as well as a subnet in every Availability Zone within the region. If you plan to use Elastio only within the VPC, deploy the vault entirely in private subnets with NAT gateways. In this case the vault can only be accessed from within the private subnets. Mounting backups from your workstations becomes possible only after you set up a VPN tunnel into the VPC with a network path from the VPN tunnel to the private subnets where the vault is running.
Enable VSS for an AWS EC2 instance
Without VSS (Volume Shadow Copy Service) enabled on Windows instances, only crash-consistent backups can be created. On systems under heavy load, this kind of backup can produce snapshots with inconsistent data, so the data cannot be properly scanned for threats or restored from such recovery points. To avoid these risks, it is recommended to enable VSS in AWS. The instructions below walk you through the steps to configure VSS and enable it on the instances that need it.
First you’ll need to create a policy with necessary permissions:
- Go to IAM section of the AWS console.
- In the navigation pane, select Policies, and then Create policy.
- On the Create policy page, choose the JSON tab, and then replace the default content with the following JSON policy.
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "ec2:CreateTags",
"Resource": [
"arn:aws:ec2:*::snapshot/*",
"arn:aws:ec2:*::image/*"
]
},
{
"Effect": "Allow",
"Action": [
"ec2:DescribeInstances",
"ec2:CreateSnapshot",
"ec2:CreateImage",
"ec2:DescribeImages",
"ec2:DescribeSnapshots"
],
"Resource": "*"
}
]
}
- Press Review policy.
- For Name, enter VssSnapshotRole.
- Press Create policy.
Then you’ll need to create a role. To create the role do the following:
- In the navigation pane, select Roles, and then Create role.
- Under Select type of trusted entity, choose AWS Service.
- Choose EC2, and then choose Next: Permissions.
- In the list of policies, check the box next to AmazonSSMManagedInstanceCore and press Next: Tags.
- For Role name, enter VssSnapshotRole.
- Select Create role. The system returns you to the Roles page.
- Select the role that you just created and press Attach policies.
- Search for the VssSnapshotRole policy you created in the previous procedure, check the box next to it, and press Attach policy.
Alternatively, you can add the AmazonSSMManagedInstanceCore and VssSnapshotRole policies to the role already attached to the instances.
Then you’ll need to install the VSS components on the Windows EC2 instances:
- Open the AWS Systems Manager console.
- In the navigation pane, select Run Command.
- For Command document, choose AWS-ConfigureAWSPackage.
- For Command parameters, do the following:
  - Verify that Action is set to Install.
  - For Name, enter AwsVssComponents.
  - For Version, leave the field empty so that Systems Manager installs the latest version.
- For Targets, manually select the instances on which you want to run this operation.
- Press Run.
The VSS should become enabled for the instances that you attach the newly created role to.
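The Run Command step has an AWS CLI equivalent, which can be handy when many instances need the components. The following is a sketch, assuming the instances are already managed by Systems Manager (that is, they have the role with AmazonSSMManagedInstanceCore attached); the function name and instance IDs are placeholders.

```shell
# Sketch: install AwsVssComponents on the given instances through
# Systems Manager Run Command. Prints the ID of the command that was started.
install_vss_components() {
  # "$@" holds one or more Windows EC2 instance IDs.
  [ "$#" -gt 0 ] || { echo "usage: install_vss_components INSTANCE_ID..." >&2; return 2; }

  # Omitting "version" lets Systems Manager install the latest version.
  aws ssm send-command \
    --document-name "AWS-ConfigureAWSPackage" \
    --instance-ids "$@" \
    --parameters '{"action":["Install"],"name":["AwsVssComponents"]}' \
    --query 'Command.CommandId' --output text
}

# Usage (placeholder instance IDs):
#   install_vss_components i-0123456789abcdef0 i-0fedcba9876543210
```

The result for each instance can then be inspected with `aws ssm list-command-invocations --command-id <ID> --details`.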
Wrong AWS account or region
The following error message is displayed when you try to mount a recovery point, using a command copied from the Elastio Tenant, in a CLI that is configured for the wrong AWS account or region:
$ sudo -E elastio mount rp --rp rp-01g4zdj9c8ay1kx08y39dyq426
Recovery point was not found in vault `{any}` for ID `rp-01g4zdj9c8ay1kx08y39dyq426`
Location:
/home/elastio/.cargo/registry/docs/dl.cloudsmith.io-2dc4edbccd98c64c/cheburashka-4.3.2/docs/logging/mod.rs:592:18
To resolve the issue, run aws sts get-caller-identity to ensure that the account number is the same as the recovery point’s account number. If those match, run aws configure and check that the region of the CLI matches the region of the recovery point.
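Both checks can be combined into one helper that compares the CLI configuration with the values you expect. This is a sketch: the function name is hypothetical, and the expected account ID and region must be taken from the recovery point yourself. Note that `aws configure get region` reads the configured profile and does not reflect a region set only through environment variables.

```shell
# Sketch: compare the AWS CLI's account and region with the expected ones.
# Returns 0 when both match, 1 otherwise.
check_aws_context() {
  expected_account="$1"
  expected_region="$2"

  actual_account=$(aws sts get-caller-identity --query Account --output text)
  actual_region=$(aws configure get region)

  rc=0
  if [ "$actual_account" != "$expected_account" ]; then
    echo "account mismatch: CLI uses $actual_account, expected $expected_account"
    rc=1
  fi
  if [ "$actual_region" != "$expected_region" ]; then
    echo "region mismatch: CLI uses $actual_region, expected $expected_region"
    rc=1
  fi
  [ "$rc" -eq 0 ] && echo "account and region match"
  return "$rc"
}

# Usage (placeholder values):
#   check_aws_context 111122223333 us-east-1
```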
Cloud Connector is not present in the selected region
The following error message is displayed when attempting to perform EBS or EC2 backup if the AWS CLI is configured for another region or AWS account:
$ elastio ec2 backup --instance-id i-0ea486582654d7562
Failed to fetch data for a default vault
Caused by:
0: Catalog service error
1: Failed to invoke remote function elastio-catalog-service-read
2: Function not found: arn:aws:lambda:us-east-1:421555810956:function:elastio-catalog-service-read
3: Function not found: arn:aws:lambda:us-east-1:421555810956:function:elastio-catalog-service-read
Location:
/home/elastio/.cargo/registry/docs/dl.cloudsmith.io-2dc4edbccd98c64c/cheburashka-4.2.12/docs/logging/mod.rs:588:18
The following error message is displayed when attempting to perform file, block or stream backup in a region that does not have a Cloud Connector installed:
$ elastio file backup ./123.pub
Failed to invoke remote function elastio-jobs-status-service
Caused by:
0: Function not found: arn:aws:lambda:us-east-1:421555810956:function:elastio-jobs-status-service
1: Function not found: arn:aws:lambda:us-east-1:421555810956:function:elastio-jobs-status-service
Location:
/home/elastio/.cargo/registry/docs/dl.cloudsmith.io-2dc4edbccd98c64c/cheburashka-4.2.12/docs/logging/mod.rs:588:18
Run elastio version to ensure that Cloud Connector is deployed in the current region. Run aws sts get-caller-identity to check if the account configured for the AWS CLI is correct. If the region is wrong, change it by running aws configure. If the wrong account is reached, reconfigure the AWS CLI with the correct keys or profile.
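As an additional check, you can ask AWS directly whether the Lambda function named in the error exists in a given region. This is only a sketch: it assumes that `elastio-catalog-service-read`, the function name taken from the error output above, is present wherever a Cloud Connector is deployed, and that the caller has the `lambda:GetFunction` permission. The helper name is hypothetical.

```shell
# Sketch: report whether the Lambda function from the error message can be
# found in a region. A failure can also mean missing permissions.
cloud_connector_present() {
  region="$1"
  if aws lambda get-function \
       --function-name elastio-catalog-service-read \
       --region "$region" >/dev/null 2>&1; then
    echo "function found in $region"
  else
    echo "function not found or not accessible in $region"
    return 1
  fi
}

# Usage:
#   cloud_connector_present us-east-1
```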
The assets are not displayed in the Elastio Tenant
After logging into an Elastio Tenant and connecting an AWS account, no assets are displayed on the Assets page of the Tenant.
Make sure that there are EC2 and/or EBS assets in the connected account and region. If there are none in the AWS account, none will be displayed in Elastio.
A default vault is missing
When attempting to run a backup or a restore through the Elastio CLI, the following error is displayed:
Failed to find default vault. Specify '--vault' option or set the default vault and run again
Location:
/home/elastio/.cargo/registry/docs/dl.cloudsmith.io-2dc4edbccd98c64c/cheburashka-4.4.0/docs/logging/mod.rs:593:18
Check whether a default vault is set by running elastio vault default. If the output is empty, run elastio vault list, pick a vault, and run elastio vault default vault-name to set it as the default. Alternatively, you can navigate to this account through the Sources page in the Elastio Tenant, select a vault and set it as default. If the vault list comes back empty, create a vault through the Sources page in your Tenant.
Note: For the time being, please limit the number of vaults to a maximum of 7 per region.
Note: Once a default vault is set, you can omit the --vault flag. Commands that require a vault use the default vault if none is specified.
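The check described above can be wrapped in a small helper. This sketch relies only on the behavior stated in this section, namely that elastio vault default prints nothing when no default is set; the function name is hypothetical.

```shell
# Sketch: print the default vault, or list the available vaults when
# no default is set. Returns 1 in the second case.
ensure_default_vault() {
  current=$(elastio vault default)
  if [ -n "$current" ]; then
    echo "default vault: $current"
    return 0
  fi
  echo "no default vault is set; available vaults:"
  elastio vault list
  return 1
}

# Usage:
#   ensure_default_vault || echo "pick one and run: elastio vault default vault-name"
```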
Integrity scan fails
The following error message is displayed when iscan is run against a recovery point that cannot be mounted because the target EBS volume does not have a filesystem.
Figure 1.2: Integrity Scan error message
This error most likely occurs with new EBS volumes that were created manually through the AWS Console and never formatted. To resolve the issue, attach the EBS volume to an instance and create a filesystem on it.
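On a Linux instance, creating the filesystem could look like the sketch below. The device name and the choice of ext4 are assumptions; check the actual device name with lsblk after attaching the volume. Because mkfs destroys existing data, the helper (a hypothetical name) refuses to touch a device that already reports a filesystem.

```shell
# Sketch: create an ext4 filesystem on a device only if it has none.
format_if_empty() {
  device="$1"

  # FSTYPE is empty for a device without a filesystem.
  out=$(lsblk -n -o FSTYPE "$device") || return 1
  fstype=$(printf '%s' "$out" | tr -d '[:space:]')

  if [ -n "$fstype" ]; then
    echo "$device already has a filesystem ($fstype); leaving it untouched"
    return 0
  fi
  sudo mkfs -t ext4 "$device"
}

# Usage (placeholder device name; on Nitro instances it may look like /dev/nvme1n1):
#   format_if_empty /dev/xvdf
```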
Deploying the Elastio CFN in the wrong AWS account
When deploying the Elastio CFN in the wrong AWS account, the following error message appears.
Figure 1.3: The Elastio CFN deploy error message
To resolve the issue, switch to the required resource account on the AWS Management Console.
EC2 backup fails
The following error message is displayed when attempting to perform AWS EC2 backup.
Figure 1.4: AWS EC2 backup error message
The same error in the logs is displayed as follows:
{
"timestamp": "2022-05-28T07:03:05.344Z",
"level": "ERROR",
"fields": {
"error": [
"Failed to get volume state, reason Unknown(BufferedHttpResponse {status: 400, body: \"<?xml version=\\\"1.0\\\" encoding=\\\"UTF-8\\\"?>\\n<Response><Errors><Error><Code>InvalidVolume.NotFound</Code><Message>The volume 'vol-05e873f1a99eec441' does not exist.</Message></Error></Errors><RequestID>fe978390-f1f7-4d39-9e0f-b71579561d06</RequestID></Response>\", headers: {\"x-amzn-requestid\": \"fe978390-f1f7-4d39-9e0f-b71579561d06\", \"cache-control\": \"no-cache, no-store\", \"strict-transport-security\": \"max-age=31536000; includeSubDomains\", \"vary\": \"accept-encoding\", \"content-type\": \"text/xml;charset=UTF-8\", \"transfer-encoding\": \"chunked\", \"date\": \"Sat, 28 May 2022 07:03:05 GMT\", \"connection\": \"close\", \"server\": \"AmazonEC2\"} })"
],
"message": "VolumeAwsImpl::state() failed"
},
"target": "elastio_agentless::aws::volume",
"filename": "cli/elastio-agentless/docs/aws/volume.rs",
"line_number": 415,
"spans": [
{
"availability_zone": "us-east-1c",
"instance_type": "",
"op_type": "create_volume_from_snapshot",
"platform": "aws",
"volume_iops": 100,
"volume_size": 0,
"volume_type": "gp3",
"name": "AgentlessLabels"
}
],
"threadId": "ThreadId(6)"
}
The EC2 instance has a volume encrypted with a key that Elastio does not have access to. The issue is going to be fixed in an upcoming release.
Block mount fails
When attempting to perform a block mount on a supported Linux-based OS without the NBD kernel module installed, the following error message is displayed:
$ sudo -E elastio mount rp --rp rp-01g8e24sp9twa14brpan41bq5f
Unknown: Failed to load `nbd` module. Ensure your Linux kernel is compiled with NBD support.
Caused by:
command ["modprobe", "nbd"] exited with code 1
Location:
/home/elastio/.cargo/registry/docs/dl.cloudsmith.io-2dc4edbccd98c64c/cheburashka-4.4.0/docs/logging/mod.rs:593:18
To resolve the issue, make sure that the nbd-client package is installed. Furthermore, make sure that the elastio mount command is executed as root or with sudo.
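Before retrying the mount, it can help to confirm that the module actually loads. The sketch below repeats the modprobe call from the error output; the function name is hypothetical, and the package installation command depends on your distribution (on Debian-based systems it would typically be `sudo apt-get install nbd-client`, which is an assumption about your environment).

```shell
# Sketch: make sure the nbd kernel module is loaded, loading it if needed.
ensure_nbd_loaded() {
  # lsmod prints one line per loaded module, starting with the module name.
  if lsmod | grep -q '^nbd '; then
    echo "nbd module is already loaded"
    return 0
  fi
  if sudo modprobe nbd; then
    echo "nbd module loaded"
    return 0
  fi
  echo "could not load nbd; check that the nbd-client package is installed" >&2
  return 1
}

# Usage:
#   ensure_nbd_loaded && sudo -E elastio mount rp --rp <recovery point ID>
```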