btrfs doesn't report transfer progress on both sides of the migration #676

Closed
phol opened this issue Mar 27, 2024 · 14 comments
Labels: Bug (Confirmed to be a bug), Easy (Good for new contributors)

Comments

phol commented Mar 27, 2024

Required information

  • Distribution: Debian
  • Distribution version: 12

Incus info of local machine

config:
  acme.agree_tos: "true"
  core.https_address: :8443
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_dev_incus
- migration_pre_copy
- infiniband
- dev_incus_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- dev_incus_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- images_all_projects
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- zfs_delegate
- storage_api_remote_volume_snapshot_copy
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- image_restriction_privileged
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- certificate_description
- disk_io_bus_virtio_blk
- loki_config_instance
- instance_create_start
- clustering_evacuation_stop_options
- boot_host_shutdown_action
- agent_config_drive
- network_state_ovn_lr
- image_template_permissions
- storage_bucket_backup
- storage_lvm_cluster
- shared_custom_block_volumes
- auth_tls_jwt
- oidc_claim
- device_usb_serial
- numa_cpu_balanced
- image_restriction_nesting
- network_integrations
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: beheer
auth_user_method: unix
environment:
  addresses:
  - 10.0.0.136:8443
  - '[redacted]:8443'
  - 10.0.0.91:8443
  - '[redacted]:8443'
  - 10.10.88.1:8443
  - '[fdcc:b486:4ec1::1]:8443'
  - 10.10.99.1:8443
  - '[fd9a:57b8:36a5::1]:8443'
  architectures:
  - aarch64
  - armv6l
  - armv7l
  - armv8l
  certificate: |
    -----BEGIN CERTIFICATE-----
    REMOVED
    -----END CERTIFICATE-----
  certificate_fingerprint: REMOVED
  driver: lxc
  driver_version: 5.0.3
  firewall: nftables
  kernel: Linux
  kernel_architecture: aarch64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_binfmt: "false"
    unpriv_fscaps: "true"
  kernel_version: 6.1.0-18-arm64
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Debian GNU/Linux
  os_version: "12"
  project: default
  server: incus
  server_clustered: false
  server_event_mode: full-mesh
  server_name: hermes
  server_pid: 3790551
  server_version: "0.7"
  storage: btrfs
  storage_version: "6.2"
  storage_supported_drivers:
  - name: btrfs
    version: "6.2"
    remote: false
  - name: dir
    version: "1"
    remote: false

Incus info of remote

config:
  core.https_address: '[::]:8443'
api_extensions:
- storage_zfs_remove_snapshots
- container_host_shutdown_timeout
- container_stop_priority
- container_syscall_filtering
- auth_pki
- container_last_used_at
- etag
- patch
- usb_devices
- https_allowed_credentials
- image_compression_algorithm
- directory_manipulation
- container_cpu_time
- storage_zfs_use_refquota
- storage_lvm_mount_options
- network
- profile_usedby
- container_push
- container_exec_recording
- certificate_update
- container_exec_signal_handling
- gpu_devices
- container_image_properties
- migration_progress
- id_map
- network_firewall_filtering
- network_routes
- storage
- file_delete
- file_append
- network_dhcp_expiry
- storage_lvm_vg_rename
- storage_lvm_thinpool_rename
- network_vlan
- image_create_aliases
- container_stateless_copy
- container_only_migration
- storage_zfs_clone_copy
- unix_device_rename
- storage_lvm_use_thinpool
- storage_rsync_bwlimit
- network_vxlan_interface
- storage_btrfs_mount_options
- entity_description
- image_force_refresh
- storage_lvm_lv_resizing
- id_map_base
- file_symlinks
- container_push_target
- network_vlan_physical
- storage_images_delete
- container_edit_metadata
- container_snapshot_stateful_migration
- storage_driver_ceph
- storage_ceph_user_name
- resource_limits
- storage_volatile_initial_source
- storage_ceph_force_osd_reuse
- storage_block_filesystem_btrfs
- resources
- kernel_limits
- storage_api_volume_rename
- network_sriov
- console
- restrict_dev_incus
- migration_pre_copy
- infiniband
- dev_incus_events
- proxy
- network_dhcp_gateway
- file_get_symlink
- network_leases
- unix_device_hotplug
- storage_api_local_volume_handling
- operation_description
- clustering
- event_lifecycle
- storage_api_remote_volume_handling
- nvidia_runtime
- container_mount_propagation
- container_backup
- dev_incus_images
- container_local_cross_pool_handling
- proxy_unix
- proxy_udp
- clustering_join
- proxy_tcp_udp_multi_port_handling
- network_state
- proxy_unix_dac_properties
- container_protection_delete
- unix_priv_drop
- pprof_http
- proxy_haproxy_protocol
- network_hwaddr
- proxy_nat
- network_nat_order
- container_full
- backup_compression
- nvidia_runtime_config
- storage_api_volume_snapshots
- storage_unmapped
- projects
- network_vxlan_ttl
- container_incremental_copy
- usb_optional_vendorid
- snapshot_scheduling
- snapshot_schedule_aliases
- container_copy_project
- clustering_server_address
- clustering_image_replication
- container_protection_shift
- snapshot_expiry
- container_backup_override_pool
- snapshot_expiry_creation
- network_leases_location
- resources_cpu_socket
- resources_gpu
- resources_numa
- kernel_features
- id_map_current
- event_location
- storage_api_remote_volume_snapshots
- network_nat_address
- container_nic_routes
- cluster_internal_copy
- seccomp_notify
- lxc_features
- container_nic_ipvlan
- network_vlan_sriov
- storage_cephfs
- container_nic_ipfilter
- resources_v2
- container_exec_user_group_cwd
- container_syscall_intercept
- container_disk_shift
- storage_shifted
- resources_infiniband
- daemon_storage
- instances
- image_types
- resources_disk_sata
- clustering_roles
- images_expiry
- resources_network_firmware
- backup_compression_algorithm
- ceph_data_pool_name
- container_syscall_intercept_mount
- compression_squashfs
- container_raw_mount
- container_nic_routed
- container_syscall_intercept_mount_fuse
- container_disk_ceph
- virtual-machines
- image_profiles
- clustering_architecture
- resources_disk_id
- storage_lvm_stripes
- vm_boot_priority
- unix_hotplug_devices
- api_filtering
- instance_nic_network
- clustering_sizing
- firewall_driver
- projects_limits
- container_syscall_intercept_hugetlbfs
- limits_hugepages
- container_nic_routed_gateway
- projects_restrictions
- custom_volume_snapshot_expiry
- volume_snapshot_scheduling
- trust_ca_certificates
- snapshot_disk_usage
- clustering_edit_roles
- container_nic_routed_host_address
- container_nic_ipvlan_gateway
- resources_usb_pci
- resources_cpu_threads_numa
- resources_cpu_core_die
- api_os
- container_nic_routed_host_table
- container_nic_ipvlan_host_table
- container_nic_ipvlan_mode
- resources_system
- images_push_relay
- network_dns_search
- container_nic_routed_limits
- instance_nic_bridged_vlan
- network_state_bond_bridge
- usedby_consistency
- custom_block_volumes
- clustering_failure_domains
- resources_gpu_mdev
- console_vga_type
- projects_limits_disk
- network_type_macvlan
- network_type_sriov
- container_syscall_intercept_bpf_devices
- network_type_ovn
- projects_networks
- projects_networks_restricted_uplinks
- custom_volume_backup
- backup_override_name
- storage_rsync_compression
- network_type_physical
- network_ovn_external_subnets
- network_ovn_nat
- network_ovn_external_routes_remove
- tpm_device_type
- storage_zfs_clone_copy_rebase
- gpu_mdev
- resources_pci_iommu
- resources_network_usb
- resources_disk_address
- network_physical_ovn_ingress_mode
- network_ovn_dhcp
- network_physical_routes_anycast
- projects_limits_instances
- network_state_vlan
- instance_nic_bridged_port_isolation
- instance_bulk_state_change
- network_gvrp
- instance_pool_move
- gpu_sriov
- pci_device_type
- storage_volume_state
- network_acl
- migration_stateful
- disk_state_quota
- storage_ceph_features
- projects_compression
- projects_images_remote_cache_expiry
- certificate_project
- network_ovn_acl
- projects_images_auto_update
- projects_restricted_cluster_target
- images_default_architecture
- network_ovn_acl_defaults
- gpu_mig
- project_usage
- network_bridge_acl
- warnings
- projects_restricted_backups_and_snapshots
- clustering_join_token
- clustering_description
- server_trusted_proxy
- clustering_update_cert
- storage_api_project
- server_instance_driver_operational
- server_supported_storage_drivers
- event_lifecycle_requestor_address
- resources_gpu_usb
- clustering_evacuation
- network_ovn_nat_address
- network_bgp
- network_forward
- custom_volume_refresh
- network_counters_errors_dropped
- metrics
- image_source_project
- clustering_config
- network_peer
- linux_sysctl
- network_dns
- ovn_nic_acceleration
- certificate_self_renewal
- instance_project_move
- storage_volume_project_move
- cloud_init
- network_dns_nat
- database_leader
- instance_all_projects
- clustering_groups
- ceph_rbd_du
- instance_get_full
- qemu_metrics
- gpu_mig_uuid
- event_project
- clustering_evacuation_live
- instance_allow_inconsistent_copy
- network_state_ovn
- storage_volume_api_filtering
- image_restrictions
- storage_zfs_export
- network_dns_records
- storage_zfs_reserve_space
- network_acl_log
- storage_zfs_blocksize
- metrics_cpu_seconds
- instance_snapshot_never
- certificate_token
- instance_nic_routed_neighbor_probe
- event_hub
- agent_nic_config
- projects_restricted_intercept
- metrics_authentication
- images_target_project
- images_all_projects
- cluster_migration_inconsistent_copy
- cluster_ovn_chassis
- container_syscall_intercept_sched_setscheduler
- storage_lvm_thinpool_metadata_size
- storage_volume_state_total
- instance_file_head
- instances_nic_host_name
- image_copy_profile
- container_syscall_intercept_sysinfo
- clustering_evacuation_mode
- resources_pci_vpd
- qemu_raw_conf
- storage_cephfs_fscache
- network_load_balancer
- vsock_api
- instance_ready_state
- network_bgp_holdtime
- storage_volumes_all_projects
- metrics_memory_oom_total
- storage_buckets
- storage_buckets_create_credentials
- metrics_cpu_effective_total
- projects_networks_restricted_access
- storage_buckets_local
- loki
- acme
- internal_metrics
- cluster_join_token_expiry
- remote_token_expiry
- init_preseed
- storage_volumes_created_at
- cpu_hotplug
- projects_networks_zones
- network_txqueuelen
- cluster_member_state
- instances_placement_scriptlet
- storage_pool_source_wipe
- zfs_block_mode
- instance_generation_id
- disk_io_cache
- amd_sev
- storage_pool_loop_resize
- migration_vm_live
- ovn_nic_nesting
- oidc
- network_ovn_l3only
- ovn_nic_acceleration_vdpa
- cluster_healing
- instances_state_total
- auth_user
- security_csm
- instances_rebuild
- numa_cpu_placement
- custom_volume_iso
- network_allocations
- zfs_delegate
- storage_api_remote_volume_snapshot_copy
- operations_get_query_all_projects
- metadata_configuration
- syslog_socket
- event_lifecycle_name_and_project
- instances_nic_limits_priority
- disk_initial_volume_configuration
- operation_wait
- image_restriction_privileged
- cluster_internal_custom_volume_copy
- disk_io_bus
- storage_cephfs_create_missing
- instance_move_config
- ovn_ssl_config
- certificate_description
- disk_io_bus_virtio_blk
- loki_config_instance
- instance_create_start
- clustering_evacuation_stop_options
- boot_host_shutdown_action
- agent_config_drive
- network_state_ovn_lr
- image_template_permissions
- storage_bucket_backup
- storage_lvm_cluster
- shared_custom_block_volumes
- auth_tls_jwt
- oidc_claim
- device_usb_serial
- numa_cpu_balanced
- image_restriction_nesting
- network_integrations
api_status: stable
api_version: "1.0"
auth: trusted
public: false
auth_methods:
- tls
auth_user_name: beheer
auth_user_method: unix
environment:
  addresses:
  - 10.0.0.45:8443
  - '[REDACTED]:8443'
  - '[REDACTED]:8443'
  - 10.10.66.1:8443
  - '[fddb:5833:c3ee::1]:8443'
  - 10.195.49.1:8443
  - '[fd42:a6f2:9167:eda0::1]:8443'
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    REDACTED
    -----END CERTIFICATE-----
  certificate_fingerprint: REDACTED
  driver: lxc | qemu
  driver_version: 5.0.3 | 8.2.2
  firewall: nftables
  kernel: Linux
  kernel_architecture: x86_64
  kernel_features:
    idmapped_mounts: "true"
    netnsid_getifaddrs: "true"
    seccomp_listener: "true"
    seccomp_listener_continue: "true"
    uevent_injection: "true"
    unpriv_binfmt: "false"
    unpriv_fscaps: "true"
  kernel_version: 6.1.0-17-amd64
  lxc_features:
    cgroup2: "true"
    core_scheduling: "true"
    devpts_fd: "true"
    idmapped_mounts_v2: "true"
    mount_injection_file: "true"
    network_gateway_device_route: "true"
    network_ipvlan: "true"
    network_l2proxy: "true"
    network_phys_macvlan_mtu: "true"
    network_veth_router: "true"
    pidfd: "true"
    seccomp_allow_deny_syntax: "true"
    seccomp_notify: "true"
    seccomp_proxy_send_notify_fd: "true"
  os_name: Debian GNU/Linux
  os_version: "12"
  project: default
  server: incus
  server_clustered: false
  server_event_mode: full-mesh
  server_name: j
  server_pid: 3966034
  server_version: "0.7"
  storage: btrfs
  storage_version: "6.2"
  storage_supported_drivers:
  - name: btrfs
    version: "6.2"
    remote: false
  - name: dir
    version: "1"
    remote: false

Issue description

(I'm using alias lxc = incus)

Adding a remote and then executing lxc list or lxc exec from local to the remote works properly.

lxc remote add j 10.0.0.45

lxc list j: 
lxc exec j:summary-mole bash

Copying a container also seems to start, but it ends with an error:

lxc copy av j:av
Error: Failed instance creation: Error transferring instance data: Failed migration on target: Error reading migration control source: local error: tls: bad record MAC

Steps to reproduce

  1. Add a remote
  2. Try to copy a container

Information to attach

I think the error might be related either to the differing processor architectures or to the fact that the machines, which run on the Oracle Cloud free tier, are doing something strange with the network interface MAC.

On both machines, br0 provides networking for the containers.

Network config local machine

sudo cat  /etc/systemd/network/br0.netdev
[NetDev]
Name=br0
Kind=bridge
sudo cat  /etc/systemd/network/br0.network
[Match]
Name=br0

[Network]
Description=Container networking bridge
Address=10.10.88.1/24
Address=fdcc:b486:4ec1::1/64
IPMasquerade=both

ConfigureWithoutCarrier=true
ActivationPolicy=always-up

enp0s6 is the primary network interface with DHCP IP 10.0.0.136

sudo cat  /etc/systemd/network/enp0s6.network
[Match]
Name=enp0s6

[Network]
DHCP=yes

Network config remote

sudo cat  /etc/systemd/network/br0.netdev
[NetDev]
Name=br0
SkipForwardingDelay=true
Kind=bridge
sudo cat  /etc/systemd/network/br0.network
[Match]
Name=br0

[Network]
Description=Container networking bridge
Address=10.10.66.1/24
Address=fddb:5833:c3ee::1/64
IPMasquerade=both

ConfigureWithoutCarrier=true
ActivationPolicy=always-up

ens3 is the primary network interface with DHCP IP 10.0.0.45

sudo cat  /etc/systemd/network/ens3.network
[Match]
Name=ens3

[Network]
DHCP=yes
phol changed the title on Mar 27, 2024:
  from: Copy from amd64 local machine to arm64 remote and vice versa fails with "TLS bad record MAC" error on Oracle cloud
  to: Copy from arm64 local machine to amd64 remote and vice versa fails with "TLS bad record MAC" error on Oracle cloud
stgraber (Member) commented

Can you try your copy again but with --mode=relay?

stgraber (Member) commented

When you run lxc copy av j:av, it will have your local Incus generate a token which is then sent by the CLI to the j host, instructing that host to directly connect to your local system to retrieve the instance.

The fact that the target connects to the source is why you often will have no problem interacting with both servers yourself but a copy may fail due to network issues going the other way.

--mode=relay has the CLI tool itself connect to both source and target and relay the data, so that should usually work fine if your CLI was already able to interact with both servers.

You also have the option of using --mode=push which instead has the source server push directly to the target server.
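
The three modes described above can be tried one after another from the client. The following is only a sketch under stated assumptions: copy_with_fallback and the INCUS_BIN variable are hypothetical names introduced here for illustration, the copy is done with --stateless, and a failed attempt is assumed to leave nothing behind on the target.

```shell
#!/bin/sh
# Hypothetical helper, not part of Incus: try each transfer mode in turn
# until one succeeds. INCUS_BIN is an assumption made here so that the
# command can be swapped out (for example when the CLI is aliased).
INCUS_BIN="${INCUS_BIN:-incus}"

copy_with_fallback() {
    src="$1"
    dst="$2"
    # pull:  the target connects to the source (the default)
    # relay: the CLI connects to both servers and relays the data
    # push:  the source connects to the target
    for mode in pull relay push; do
        echo "Trying --mode=$mode" >&2
        if "$INCUS_BIN" copy "$src" "$dst" --stateless --mode="$mode"; then
            # Report which mode worked on stdout.
            echo "$mode"
            return 0
        fi
    done
    return 1
}
```

Calling copy_with_fallback av j:av would attempt the copy from this thread in pull, relay, and push mode, in that order, and print the name of the first mode that succeeds.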

stgraber added the Incomplete (Waiting on more information from reporter) label on Mar 27, 2024

phol commented Mar 27, 2024

Hi @stgraber and thanks for replying so quickly.

With

lxc copy av j:av --mode=relay

I'm getting

Error: Error transferring instance data: Failed migration on target: Error reading migration control source: websocket: close 1006 (abnormal closure): unexpected EOF

During the copy, I do observe a btrfs receive command in htop. However, after about 1m30s, I get the error. I tried this multiple times. During the copy, I can also observe an av instance on j when running lxc ls.

With

lxc copy av j:av --mode=push 

I'm getting this message:

Transferring instance: av: 1.32GB (18.71MB/s) 

The "transferring instance" is not shown when I specify --mode=relay or nothing at all.

However, it unfortunately also errors out. With all three modes, this happens after around 1m30s.

time lxc copy av j:av --mode=push 
Error: Failed instance migration: Failed migration on source: migration dump failed
(00.011761) Error (criu/namespaces.c:460): Can't dump nested uts namespace for 4738
(00.011764) Error (criu/namespaces.c:721): Can't make utsns id
(00.018309) Error (criu/util.c:642): exited, status=1
(00.021460) Error (criu/util.c:642): exited, status=1
(00.022562) Error (criu/cr-dump.c:2098): Dumping FAILED.
incus copy av j:av --mode=push  0.09s user 0.02s system 0% cpu 1:24.65 total

stgraber (Member) commented

Ah, please stop the container first or pass --stateless as live migrations of containers are very unlikely to succeed.

phol commented Mar 27, 2024

Alright, I'm running this now. Will update you when it completes.

time lxc copy av j:av --mode=push --stateless

phol commented Mar 27, 2024

I tried the commands.

lxc copy av j:av --mode=push --stateless

Works.

lxc copy av j:av --mode=relay --stateless
lxc copy av j:av --stateless

Work as well but don't show progress.
Do you have any idea why this might happen?

Also, about the --stateless flag: Would it perhaps be possible to force the --stateless flag when copying containers between hosts with differing processor architectures?
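
The idea can be approximated on the client side. This is a minimal sketch, assuming the kernel_architecture field appears in the incus info output as it does in the two dumps above; kernel_arch and stateless_flag are hypothetical helper names introduced here, not part of Incus.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Incus.

# Read `incus info`-style YAML on stdin and print the kernel architecture.
kernel_arch() {
    awk '$1 == "kernel_architecture:" { print $2; exit }'
}

# Print "--stateless" when the two architectures differ, nothing otherwise.
stateless_flag() {
    if [ "$1" != "$2" ]; then
        echo "--stateless"
    fi
}

# Possible usage (commented out because it needs two reachable servers):
#   local_arch=$(incus info | kernel_arch)
#   remote_arch=$(incus info j: | kernel_arch)
#   incus copy av j:av $(stateless_flag "$local_arch" "$remote_arch")
```

As noted earlier in the thread, live migration of containers is unlikely to succeed even between identical architectures, so passing --stateless unconditionally (or stopping the container first) is the simpler choice.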

stgraber added the Bug (Confirmed to be a bug) and Easy (Good for new contributors) labels and removed the Incomplete (Waiting on more information from reporter) label on Mar 27, 2024
stgraber added this to the incus-6.0 milestone on Mar 27, 2024
stgraber changed the title on Mar 27, 2024:
  from: Copy from arm64 local machine to amd64 remote and vice versa fails with "TLS bad record MAC" error on Oracle cloud
  to: Make sure progress and errors on instance copy is identical regardless of mode
stgraber self-assigned this on Mar 27, 2024
stgraber (Member) commented

Hmm, I've been unable to reproduce the migration error issue here:

root@v1:~# incus copy u1 v2:u1
Error: Failed instance creation: Error transferring instance data: Failed migration on target: Error from migration control source: Failed migration on source: migration dump failed
(00.003977) Error (criu/namespaces.c:460): Can't dump nested uts namespace for 3850
(00.003979) Error (criu/namespaces.c:721): Can't make utsns id
(00.007038) Error (criu/util.c:642): exited, status=1
(00.007842) Error (criu/util.c:642): exited, status=1
(00.008333) Error (criu/cr-dump.c:2098): Dumping FAILED.
root@v1:~# incus copy u1 v2:u1 --mode=relay
Error: Error transferring instance data: Failed migration on target: Error from migration control source: Failed migration on source: migration dump failed
(00.003417) Error (criu/namespaces.c:460): Can't dump nested uts namespace for 3850
(00.003418) Error (criu/namespaces.c:721): Can't make utsns id
(00.005060) Error (criu/util.c:642): exited, status=1
(00.005759) Error (criu/util.c:642): exited, status=1
(00.006247) Error (criu/cr-dump.c:2098): Dumping FAILED.
root@v1:~# incus copy u1 v2:u1 --mode=push
Error: Failed instance migration: Failed migration on source: migration dump failed
(00.003274) Error (criu/namespaces.c:460): Can't dump nested uts namespace for 3850
(00.003276) Error (criu/namespaces.c:721): Can't make utsns id
(00.005401) Error (criu/util.c:642): exited, status=1
(00.006276) Error (criu/util.c:642): exited, status=1
(00.006805) Error (criu/cr-dump.c:2098): Dumping FAILED.
root@v1:~# 

Then with stateless:

root@v1:~# incus copy u1 v2:u1 --stateless && incus delete -f v2:u1
root@v1:~# incus copy u1 v2:u1 --stateless --mode=relay && incus delete -f v2:u1
root@v1:~# incus copy u1 v2:u1 --stateless --mode=push && incus delete -f v2:u1

All of those showed transfer progress information as expected.

In my case, this was with Incus 0.7 on both the source and target servers; the CLI is Incus 0.7 too, and storage on both source and target is the basic dir backend.

stgraber (Member) commented

Moving the issue back to Incomplete and un-milestone until I can reproduce an issue with either the error handling or transfer progress.

stgraber removed the Bug (Confirmed to be a bug) and Easy (Good for new contributors) labels on Mar 27, 2024
stgraber removed this from the incus-6.0 milestone on Mar 27, 2024

phol commented Mar 27, 2024

I was running both machines with incus 0.7, but with the btrfs storage backend on both.

phol commented Mar 27, 2024

Can I be of any help by e.g. retrying with debugging flags or something like that?

stgraber (Member) commented

I'll retry with btrfs to see if that makes it behave differently here.

stgraber (Member) commented

Errors still propagate correctly but the progress information is indeed missing, so that's a btrfs driver issue then.

stgraber added the Bug (Confirmed to be a bug) and Easy (Good for new contributors) labels on Mar 27, 2024
stgraber added this to the incus-6.0 milestone on Mar 27, 2024
stgraber changed the title on Mar 27, 2024:
  from: Make sure progress and errors on instance copy is identical regardless of mode
  to: btrfs doesn't report transfer progress on both sides of the migration

phol commented Mar 27, 2024

Alright, great. I'm happy to hear I was able to report a bug that helps improve the project and that I wasn't wasting your time.
Also, wow, I don't think I've ever seen a bug resolved as quickly as this one!

Thank you for all your efforts and good night :)

hi-ko commented Apr 4, 2024

I don't think it's specific to the btrfs driver only. This also affects the zfs driver: using --mode pull --refresh does not show any progress when running on the target host.

I also get the error mentioned in the issue description on some containers when copying over a WAN through a tunnel:

Error: Failed instance creation: Error transferring instance data: Failed migration on target: Error reading migration control source: local error: tls: bad record MAC

Rerunning the command later may succeed.

Full command used:

incus copy $host:$container $container --mode pull --refresh --project $project --target-project=$project

Source and target servers both run incus 6.0.0.
