Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.

...

Info

windwos bsr 1.6.1 설치 및 1:1 연결의 주요 동작 로그로 작성되었습니다.

  • 작성 된 주요 동작 로그는 성공 시 출력되는 로그이며 실패 시 다른 로그가 출력됩니다.

동작

커널 로드

  • windows 에서 bsr 커널 드라이버는 시스템 시작 시 커널에 로드되며 로드 시 출력 되는 로그입니다.

    • bsr 드라이버 버전과 bsr 드라이버에서 사용하는 port 정보등을 출력합니다.

    • 시스템 로그가 없는 상황에서는 커널 로드 로그 출력 시간을 통해 시스템 시작 시간을 짐작하기도 합니다.

...

  • bsradm.exe, bsrsetup.exe, bsrmeta.exe, bsrcon.exe 으로 명령 실행 시 bsr 커널에서 실행되는 명령의 경우 bsr 커널 명령 실행 전 실행 후 로그를 출력합니다.

    • 명령 실행 시 “bsr netlink cmd(~) begin”, 명령 실행 후 “bsr netlink cmd(~) done” 로그를 출력하며 실행되는 명령은 cmd()안에 표시됩니다.

      • 예를들어 명령 실행 시 cmd“cmd(~) begin begin” 은 있으나 매칭되는 cmd“cmd(~) done done” 이 없다면 명령이 완료되지 않은 상황임을 알 수 있습니다.

    • 명령에따라 실행 시 완료까지 대기하거고 완료되거나 대기하지 않고 완료될 수 있습니다.

    • 명령 실행으로 출력되는 로그는 ..으로 작성했으며 다른 주요 동작 로그에서 작성하고 설명합니다.

Code Block
00000018 10/13/2020 16:09:29.0058164 [NetlinkWorkThread] [NETLINK:18] bsr_info<6> [0xFFFFE000CCAA5040] bsr netlink cmd(BSR_ADM_NEW_RESOURCE:7) begin ->
..
00000020 10/13/2020 16:09:29.0612210 [NetlinkWorkThread] [NETLINK:20] bsr_info<6> [0xFFFFE000CCAA5040] bsr netlink cmd(BSR_ADM_NEW_RESOURCE:7) done (cmd_pending:0) <-
..
00000021 10/13/2020 16:09:29.0902772 [NetlinkWorkThread] [NETLINK:18] bsr_info<6> [0xFFFFE000CCAC4880] bsr netlink cmd(BSR_ADM_NEW_MINOR:5) begin ->
..
00000023 10/13/2020 16:09:29.0906385 [NetlinkWorkThread] [NETLINK:20] bsr_info<6> [0xFFFFE000CCAC4880] bsr netlink cmd(BSR_ADM_NEW_MINOR:5) done (cmd_pending:0) <-
..
00000024 10/13/2020 16:09:29.1011907 [NetlinkWorkThread] [NETLINK:18] bsr_info<6> [0xFFFFE000CCAA5040] bsr netlink cmd(BSR_ADM_NEW_PEER:44) begin ->
..
00000027 10/13/2020 16:09:29.1014684 [NetlinkWorkThread] [NETLINK:20] bsr_info<6> [0xFFFFE000CCAA5040] bsr netlink cmd(BSR_ADM_NEW_PEER:44) done (cmd_pending:0) <-
..
00000028 10/13/2020 16:09:29.1124180 [NetlinkWorkThread] [NETLINK:18] bsr_info<6> [0xFFFFE000CC6EF880] bsr netlink cmd(BSR_ADM_NEW_PATH:45) begin ->
..
00000029 10/13/2020 16:09:29.1140515 [NetlinkWorkThread] [NETLINK:20] bsr_info<6> [0xFFFFE000CC6EF880] bsr netlink cmd(BSR_ADM_NEW_PATH:45) done (cmd_pending:0) <-

상태 변경

  • bsr 커널은 disk, peer disk, replication, connection, role 상태 변경 시 로그를 출력합니다.

    • 복수의 상태 변경이 하나의 로그에 출력 될 수 있습니다.

    • 예상하지 못한 상태 설정 시 로그를 통해 문제 등을 유추 할 수 있습니다.

      • 예를 들어 연결 상태가 connected 에서 connecting 또는 standalone 으로 변경된다면 연결 상태 변경 로그 중심으로 원인을 분석 할 수 있습니다.

      • 로그에는 상태 변경을 진행한 함수 이름이 같이 출력되며 소스코드 확인이 가능하다면 더 상세한 분석을 할 수 있습니다.

        • 다음 로그에서 상태 변경을 진행한 함수 이름은 “change_disk_state, recevie_state, change_role, connect_work” 입니다.

Code Block
00000053 10/13/2020 16:09:29.3583295 [print_state_change] [STATE:20] bsr_info<6> bsr r0/0 minor 2, ds(Attaching), dvflag(0x0): change_disk_state, disk( Attaching -> UpToDate )
00000091 10/13/2020 16:09:48.2615732 [print_state_change] [STATE:21] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x100200): receive_state, pdsk( DUnknown -> UpToDate ) repl( Off -> Established )
00000099 10/13/2020 16:10:14.3457504 [print_state_change] [STATE:18] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0xa1c): change_role, role( Secondary -> Primary )
00000583 10/13/2020 16:14:01.5270141 [print_state_change] [STATE:19] bsr_info<6> bsr r0 pnode-id:1, cs(Connecting), prole(Unknown), cflag(0xe), scf(0xa0a): connect_work, conn( Connecting -> Connected ) peer( Unknown -> Secondary )

노드 연결

Code Block
00000791 10/13/2020 16:18:20.9761914 [NetlinkWorkThread] [NETLINK:18] bsr_info<6> [0xFFFFE000CCAFB880] bsr netlink cmd(BSR_ADM_DISCONNECT:11) begin ->
00000792 10/13/2020 16:18:20.9762023 [state_change_lock] [STATE:40] bsr_info<6> [0xFFFFE000CCAFB880] worker should not initiate state changes with CS_SERIALIZE current:FFFFF801707C9BD0 resource->worker.task:FFFFE000CDE0B820
00000793 10/13/2020 16:18:20.9762218 [change_cluster_wide_state] [TWOPC:34] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x28): Preparing cluster-wide state change 2366568536 (0->1 496/16)
00000794 10/13/2020 16:18:20.9762252 [change_cluster_wide_state] [TWOPC:35] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x28): [TWOPC:2366568536] target_node_id(1) conn(Disconnecting) repl(-) disk(-) pdsk(-) role(-) peer(-) flags (8) 
00000795 10/13/2020 16:18:20.9766356 [change_cluster_wide_state] [TWOPC:36] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x28): [TWOPC:2366568536] target_node_id(1) get_cluster_wide_reply (3) 
00000796 10/13/2020 16:18:20.9766412 [change_cluster_wide_state] [TWOPC:37] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x28): State change 2366568536: primary_nodes=0, weak_nodes=0
00000797 10/13/2020 16:18:20.9766463 [change_cluster_wide_state] [TWOPC:38] bsr_info<6> bsr r0 pnode-id:1, cs(Connected), prole(Secondary), cflag(0x10e), scf(0x28): Cluster is now split
00000798 10/13/2020 16:18:20.9766497 [change_cluster_wide_state] [TWOPC:39] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x28): Committing cluster-wide state change 2366568536 (0ms) (0->1)
00000799 10/13/2020 16:18:20.9767770 [print_state_change] [STATE:19] bsr_info<6> bsr r0 pnode-id:1, cs(Connected), prole(Secondary), cflag(0x10e), scf(0x208): conn_try_disconnect, conn( Connected -> Disconnecting ) peer( Secondary -> Unknown )
00000800 10/13/2020 16:18:20.9767831 [print_state_change] [STATE:21] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(UpToDate), prpl(Established), pdvflag(0x100600): conn_try_disconnect, pdsk( UpToDate -> DUnknown ) repl( Established -> Off )
00000801 10/13/2020 16:18:20.9768211 [bsr_recv] [SOCKET:15] bsr_info<6> bsr r0 pnode-id:1, cs(Connected), prole(Secondary), cflag(0x10e), scf(0x208): socket data receive error(-15)
00000802 10/13/2020 16:18:20.9768974 [bsr_ack_receiver] [CONNECTION:16] bsr_info<6> bsr r0 pnode-id:1, cs(Disconnecting), prole(Unknown), cflag(0x10e), scf(0x0): ack_receiver terminated
00000803 10/13/2020 16:18:20.9769028 [bsr_thread_setup] [THREAD:9] bsr_info<6> bsr r0 pnode-id:1, cs(Disconnecting), prole(Unknown), cflag(0x10e), scf(0x0): Terminating ack_recv thread
00000804 10/13/2020 16:18:20.9771990 [w_resync_timer] [RESYNC/OV:105] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x0): completed because it is not in the VerifyS or SyncTarget replication state.
00000805 10/13/2020 16:18:20.9774397 [conn_disconnect] [CONNECTION:9] bsr_info<6> bsr r0 pnode-id:1, cs(Disconnecting), prole(Unknown), cflag(0x10e), scf(0x0): Connection closed
00000806 10/13/2020 16:18:20.9774928 [print_state_change] [STATE:19] bsr_info<6> bsr r0 pnode-id:1, cs(Disconnecting), prole(Unknown), cflag(0x10e), scf(0x23): conn_disconnect, conn( Disconnecting -> StandAlone )
00000807 10/13/2020 16:18:20.9775335 [bsr_thread_setup] [THREAD:9] bsr_info<6> bsr r0 pnode-id:1, cs(StandAlone), prole(Unknown), cflag(0x10e), scf(0x0): Terminating receiver thread
00000808 10/13/2020 16:18:20.9777552 [NetlinkWorkThread] [NETLINK:20] bsr_info<6> [0xFFFFE000CCAFB880] bsr netlink cmd(BSR_ADM_DISCONNECT:11) done (cmd_pending:0) <-
00000809 10/13/2020 16:18:22.9478253 [NetlinkWorkThread] [NETLINK:18] bsr_info<6> [0xFFFFE000CCAAC040] bsr netlink cmd(BSR_ADM_CONNECT:10) begin ->
00000810 10/13/2020 16:18:22.9478355 [state_change_lock] [STATE:40] bsr_info<6> [0xFFFFE000CCAAC040] worker should not initiate state changes with CS_SERIALIZE current:FFFFF801707C9BD0 resource->worker.task:FFFFE000CDE0B820
00000811 10/13/2020 16:18:22.9478519 [print_state_change] [STATE:19] bsr_info<6> bsr r0 pnode-id:1, cs(StandAlone), prole(Unknown), cflag(0x10e), scf(0x2a): bsr_adm_connect, conn( StandAlone -> Unconnected )
00000812 10/13/2020 16:18:22.9478904 [bsr_thread_start] [THREAD:11] bsr_info<6> bsr r0 pnode-id:1, cs(Unconnected), prole(Unknown), cflag(0x10e), scf(0x0): Starting receiver thread (from worker [2108])
00000813 10/13/2020 16:18:22.9480150 [NetlinkWorkThread] [NETLINK:20] bsr_info<6> [0xFFFFE000CCAAC040] bsr netlink cmd(BSR_ADM_CONNECT:10) done (cmd_pending:0) <-
00000814 10/13/2020 16:18:22.9480265 [state_change_lock] [STATE:40] bsr_info<6> [0xFFFFE000CE728040] worker should not initiate state changes with CS_SERIALIZE current:FFFFE000CE0F9170 resource->worker.task:FFFFE000CDE0B820
00000815 10/13/2020 16:18:22.9480416 [print_state_change] [STATE:19] bsr_info<6> bsr r0 pnode-id:1, cs(Unconnected), prole(Unknown), cflag(0xe), scf(0x2a): conn_connect, conn( Unconnected -> Connecting )
00000816 10/13/2020 16:18:25.9566941 [bsr_do_features] [PROTOCOL:55] bsr_info<6> bsr r0 pnode-id:1, cs(Connecting), prole(Unknown), cflag(0xe), scf(0x0): Handshake to peer 1 successful: Agreed network protocol version 114
00000817 10/13/2020 16:18:25.9567017 [bsr_do_features] [PROTOCOL:56] bsr_info<6> bsr r0 pnode-id:1, cs(Connecting), prole(Unknown), cflag(0xe), scf(0x0): Feature flags enabled on protocol level: 0x7 TRIM THIN_RESYNC WRITE_SAME.
00000818 10/13/2020 16:18:25.9568141 [conn_connect] [SEND BUFFER:27] bsr_warn<4> bsr r0 pnode-id:1, cs(Connecting), prole(Unknown), cflag(0xe), scf(0x0): send-buffering disabled nc->sndbuf_size:0
00000819 10/13/2020 16:18:25.9568180 [bsr_thread_start] [THREAD:11] bsr_info<6> bsr r0 pnode-id:1, cs(Connecting), prole(Unknown), cflag(0xe), scf(0x0): Starting ack_recv thread (from receiver [1676])
00000820 10/13/2020 16:18:26.0424367 [state_change_lock] [STATE:40] bsr_info<6> [0xFFFFE000CDA1A040] worker should not initiate state changes with CS_SERIALIZE current:FFFFE000CDAA58C0 resource->worker.task:FFFFE000CDE0B820
00000821 10/13/2020 16:18:26.0424586 [change_cluster_wide_state] [TWOPC:34] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x82a): Preparing cluster-wide state change 2387348015 (0->1 499/146)
00000822 10/13/2020 16:18:26.0424733 [change_cluster_wide_state] [TWOPC:35] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x82a): [TWOPC:2387348015] target_node_id(1) conn(Connected) repl(-) disk(-) pdsk(-) role(Secondary) peer(-) flags (2058) 
00000823 10/13/2020 16:18:26.0430489 [change_cluster_wide_state] [TWOPC:36] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x82a): [TWOPC:2387348015] target_node_id(1) get_cluster_wide_reply (3) 
00000824 10/13/2020 16:18:26.0430560 [change_cluster_wide_state] [TWOPC:37] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x82a): State change 2387348015: primary_nodes=0, weak_nodes=0
00000825 10/13/2020 16:18:26.0430611 [change_cluster_wide_state] [TWOPC:39] bsr_info<6> bsr r0, r(Secondary), f(0x0), scf(0x82a): Committing cluster-wide state change 2387348015 (0ms) (0->1)
00000826 10/13/2020 16:18:26.0432317 [print_state_change] [STATE:19] bsr_info<6> bsr r0 pnode-id:1, cs(Connecting), prole(Unknown), cflag(0xe), scf(0xa0a): connect_work, conn( Connecting -> Connected ) peer( Unknown -> Secondary )
00000827 10/13/2020 16:18:26.0434960 [receive_SyncParam] [PROTOCOL:29] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x0): sync, resync_rate : 250k, c_plan_ahead : 20k, c_delay_target : 10k, c_fill_target : 100s, c_max_rate : 102400k, c_min_rate : 250k, ov_req_num : 10b, ov_req_interval : 100ms
00000828 10/13/2020 16:18:26.0435080 [receive_sizes] [PROTOCOL:32] bsr_info<6> bsr r0/0 minor 2, ds(UpToDate), dvflag(0x0): current_mydisk_size: 10860101632 bytes
00000829 10/13/2020 16:18:26.0435114 [receive_sizes] [PROTOCOL:33] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x0): peer_current_size: 10860101632 bytes peer_user_size: 0 bytes peer_disk_size: 10860101632 bytes peer_max_size: 10860101632 bytes
00000830 10/13/2020 16:18:26.0435168 [receive_sizes] [PROTOCOL:34] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x100000): md_effective_size: 21211136 my_user_size: 0 my_max_size: 21211136
00000831 10/13/2020 16:18:26.0435209 [get_max_agreeable_size] [GENL:16] bsr_info<6> bsr r0/0 minor 2, ds(UpToDate), dvflag(0x0): local node_id: 0
00000832 10/13/2020 16:18:26.0435236 [get_max_agreeable_size] [GENL:17] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x100000): node id(1) bitmap index(0) bitmap uuid(0x0) flags(0x10) max size(21211136) disk state(DUnknown)
00000833 10/13/2020 16:18:26.0435292 [receive_sizes] [PROTOCOL:39] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x100000): Proceed with device size setting
00000834 10/13/2020 16:18:26.0436443 [get_max_agreeable_size] [GENL:16] bsr_info<6> bsr r0/0 minor 2, ds(UpToDate), dvflag(0x0): local node_id: 0
00000835 10/13/2020 16:18:26.0436474 [get_max_agreeable_size] [GENL:17] bsr_info<6> bsr r0/0 minor 2 pnode-id:1, pdsk(DUnknown), prpl(Off), pdvflag(0x100000): node id(1) bitmap index(0) bitmap uuid(0x0) flags(0x10) max size(21211136) disk state(DUnknown)

...