Overview

bsrmon is a performance monitoring tool that tracks the entire replication process of a BSR engine and records, in real time, the time taken by each section of the engine's logic so that bottlenecks in the engine's behavior can be identified. The BSR kernel engine maintains performance-related statistics as file logs, including a section-by-section history of elapsed time derived from internal time (jiffies) records and cumulative calculations. Administrators can query these kernel engine statistics with the bsrmon utility. bsrmon is supported from version 1.6.1.

In a Linux environment, the performance monitor feature is available only in kernels with debugfs enabled, and debugfs must be mounted at /sys/kernel/debug.

You can check whether debugfs is enabled with the following command.

  • # grep CONFIG_DEBUG_FS /boot/config-`uname -r`
    CONFIG_DEBUG_FS=y

debugfs is mounted automatically starting with CentOS 7 but not in CentOS 6. On CentOS 6, mount it at /sys/kernel/debug before using the performance monitor feature. To keep the feature available after a system reboot, register the mount in fstab.

  • # mkdir /sys/kernel/debug
    # mount -t debugfs debugfs /sys/kernel/debug 
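
The mount requirement can also be checked from a script. The following is a minimal Python sketch, not part of bsrmon, that scans text in /proc/mounts format for a debugfs entry at /sys/kernel/debug. The function name is_debugfs_mounted is a hypothetical helper introduced only for this illustration.

```python
def is_debugfs_mounted(mounts_text, mount_point="/sys/kernel/debug"):
    """Return True if a debugfs filesystem is mounted at mount_point.

    mounts_text is expected in /proc/mounts format:
    device mount_point fstype options dump pass
    """
    for line in mounts_text.splitlines():
        fields = line.split()
        # fields[1] is the mount point, fields[2] is the filesystem type.
        if len(fields) >= 3 and fields[1] == mount_point and fields[2] == "debugfs":
            return True
    return False


# Example input in /proc/mounts format (illustrative lines, not real output).
sample = (
    "sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0\n"
    "debugfs /sys/kernel/debug debugfs rw,relatime 0 0\n"
)
print(is_debugfs_mounted(sample))
```

On a live Linux system the same function can be fed the content of /proc/mounts. For the fstab registration, a conventional debugfs entry has the form "debugfs /sys/kernel/debug debugfs defaults 0 0"; treat this as a typical example rather than a BSR-specific requirement.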

Types

The target types of performance measurement. Each corresponds to the types argument of bsrmon.

iostat, ioclat, reqstat, peer_reqstat, alstat, network, sendbuf, memstat, resync_ratio

iostat {resource} {vnr}

Aggregate data for read and write I/O on the replica volume (io count, kbs, kb)

 Usage examples
  • Real-time monitoring of I/O performance data

> bsrmon /watch iostat r0 0
2020-12-06_08:22:36.223
  read : IO count=0, BW=0kb/s (0KB)
  write: IO count=216, BW=110208kb/s (220416KB)
2020-12-06_08:22:37.640
  read : IO count=0, BW=0kb/s (0KB)
  write: IO count=34, BW=16896kb/s (16896KB)
2020-12-06_08:22:39.32
  read : IO count=0, BW=0kb/s (0KB)
  write: IO count=326, BW=155176kb/s (155176KB)
2020-12-06_08:22:40.106
  read : IO count=0, BW=0kb/s (0KB)
  write: IO count=330, BW=164624kb/s (164624KB)
  ...

→ Write I/O is occurring continuously

  • Statistical output of I/O performance data

> bsrmon /report iostat r0 0
Report r0 [IO STAT - vnr0]
 Run: 2021-07-20_08:14:27.405 - 2021-08-04_18:59:53.573
  read : io count=24855, bw=6325194kbyte
    BW (kbyte/s): min=40, max=284928, avg=248578, samples=24
  write: io count=8360345, bw=2139182512kbyte
    BW (kbyte/s): min=12, max=401408, avg=320256, samples=6671

→ Data aggregated for the period 2021-07-20_08:14:27.405 - 2021-08-04_18:59:53.573

→ 24 read I/O samples: an average of about 248 Mbyte/s, a minimum of 40 kbyte/s, and a maximum of about 284 Mbyte/s were processed.

→ 6671 write I/O samples: an average of about 320 Mbyte/s, a minimum of 12 kbyte/s, and a maximum of about 401 Mbyte/s were processed.
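
The kbyte/s figures in a report line can be converted to Mbyte/s when reading the output. The following Python sketch parses a "BW (kbyte/s)" line from the report above and converts the values. The helper names parse_stats and kbyte_to_mbyte are hypothetical, and the conversion assumes decimal units (1 Mbyte = 1000 kbyte) with truncation.

```python
import re


def parse_stats(line):
    """Parse 'min=40, max=284928, avg=248578, samples=24' style pairs into a dict of ints."""
    return {key: int(value) for key, value in re.findall(r"(\w+)=(\d+)", line)}


def kbyte_to_mbyte(kbyte_per_sec):
    """Convert kbyte/s to whole Mbyte/s (assumes 1 Mbyte = 1000 kbyte, truncating)."""
    return kbyte_per_sec // 1000


line = "BW (kbyte/s): min=40, max=284928, avg=248578, samples=24"
stats = parse_stats(line)
print(kbyte_to_mbyte(stats["avg"]), kbyte_to_mbyte(stats["max"]))
```

Note that the minimum of 40 kbyte/s truncates to 0 Mbyte/s, so small values are better left in kbyte/s.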

  • Calculate statistics using backup files

> bsrmon /report iostat r0 0 /f vnr0_IO_STAT_2021-07-20_081427.005 
Report r0 [vnr0_IO_STAT_2021-07-20_081427.005]
 Run: 2021-07-07_21:24:24.383 - 2021-07-20_08:14:26.371
  read : ios=6332219, bw=1595360917kbyte
    IOPS        : min=1, max=1233, avg=980, samples=6452
    BW (kbyte/s): min=4, max=428032, avg=247033, samples=6452
  write: ios=22550577, bw=3961426883kbyte
    IOPS        : min=1, max=6478, avg=786, samples=28468
    BW (kbyte/s): min=4, max=1085440, avg=138461, samples=28484

→ Extract data from the file vnr0_IO_STAT_2021-07-20_081427.005

→ Data aggregated for the period 2021-07-07_21:24:24.383 - 2021-07-20_08:14:26.371

→ 6452 read I/O samples. Measured with an average of 980 IOPS, a minimum of 1 IOPS, and a maximum of 1233 IOPS.

→ 28468 write I/O samples. Measured with an average of 786 IOPS, a minimum of 1 IOPS, and a maximum of 6478 IOPS.

ioclat {resource} {vnr}

Completion latency of local/master I/O (usec)

 Usage examples
  • Real-time monitoring of I/O complete latency

> bsrmon /watch ioclat r0 0
2020-12-08_09:33:19.171
  local clat  (usec): min=520, max=112000, avg=10347
  master clat (usec): min=610, max=256852, avg=28202
2020-12-08_09:33:20.351
  local clat  (usec): min=780, max=478387, avg=44316
  master clat (usec): min=1499, max=492829, avg=114106
2020-12-08_09:33:21.509
  local clat  (usec): min=478, max=19805, avg=4523
  master clat (usec): min=577, max=24335, avg=6303
  ...

→ Write I/O is occurring continuously

→ Outputs the minimum, maximum, and average time taken for local/master I/O to complete.

  • I/O complete latency statistics output

> bsrmon /report ioclat r0 0 
Report r0 [IO COMPLETE - vnr0]
 Run: 2020-12-07_02:42:31.440 - 2020-12-08_09:37:44.115
  local clat  (usec): min=153, max=1635570, avg=8601, samples=50
  master clat (usec): min=217, max=1667260, avg=21801, samples=50

→ Data aggregated for the period 2020-12-07_02:42:31.440 - 2020-12-08_09:37:44.115

→ 50 local clat samples, with an average of 8601 usec, a minimum of 153 usec, and a maximum of 1635570 usec.

→ 50 master clat samples, with an average of 21801 usec, a minimum of 217 usec, and a maximum of 1667260 usec.

reqstat {resource} {vnr}

Aggregate data on the time taken to reach a specific segment based on when the request was created (usec)

  • requests : Number of requests processed during the monitoring period (cycles)

  • before_queue : Time spent before submitting work queued up

  • before_al_begin : Time spent before getting LRU of ACT_LOG

  • in_actlog : Time spent before RQ_IN_ACT_LOG was set.

  • submit : Time taken for request submit to be executed

  • bio_endio: Time taken for bio end logic to be executed

  • pre_send: Time spent before sending data to peer

  • acked: Time spent before RQ_NET_PENDING was removed after sending.

  • net_done: Time spent until RQ_NET_DONE

  • destroy : Time taken until request is released

Time taken to complete each stage of the AL update (usec)

  • al_update : Number of active log updates performed during the monitoring period (cycle)

  • before_bm_write : Before bsr_bm_write_hinted() execution

  • after_bm_write : After execution of bsr_bm_write_hinted()

  • after_sync_page : after execution of bsr_md_sync_page_io()

 Usage examples
  • Real-time monitoring of request performance data

> bsrmon /watch reqstat r0 0
2020-12-08_09:42:18.549
  requests  : 445
    before_queue    (usec): min=0, max=19, avg=0
    before_al_begin (usec): min=0, max=4907, avg=11
    in_actlog       (usec): min=1, max=7191, avg=35
    submit          (usec): min=3, max=7215, avg=46
    bio_endio       (usec): min=4, max=7219, avg=50
    destroy         (usec): min=577, max=214980, avg=15711
  al_uptate  : 0
    before_bm_write (usec): min=0, max=0, avg=0
    after_bm_write  (usec): min=0, max=0, avg=0
    after_sync_page (usec): min=0, max=0, avg=0
  PEER bsr03:
    pre_send (usec): min=0, max=0, avg=0
    acked    (usec): min=0, max=0, avg=0
    net_done (usec): min=0, max=0, avg=0
  PEER 100.100.10.31:7792:
    pre_send (usec): min=571, max=214862, avg=15703
    acked    (usec): min=555, max=210004, avg=7966
    net_done (usec): min=575, max=214978, avg=15710
...

→ Write I/O is occurring continuously

→ Disconnected from peer bsr03

→ AL update is not performed
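
Because each reqstat value is measured from the time the request was created, the difference between consecutive stages gives a rough idea of where the time is spent. The following Python sketch applies this to the avg values of the watch output above. The helper name stage_deltas is hypothetical, and because each average is computed independently, the differences are approximations and not exact per-stage timings.

```python
# Stage order as listed for the local request path in the reqstat output.
STAGES = ["before_queue", "before_al_begin", "in_actlog", "submit", "bio_endio", "destroy"]


def stage_deltas(avg_usec):
    """Return the increase in average elapsed time (usec) between consecutive stages."""
    deltas = {}
    previous = 0
    for stage in STAGES:
        deltas[stage] = avg_usec[stage] - previous
        previous = avg_usec[stage]
    return deltas


# avg values taken from the watch output above.
sample = {
    "before_queue": 0,
    "before_al_begin": 11,
    "in_actlog": 35,
    "submit": 46,
    "bio_endio": 50,
    "destroy": 15711,
}
deltas = stage_deltas(sample)
slowest = max(deltas, key=deltas.get)
print(slowest, deltas[slowest])
```

In this sample almost all of the elapsed time falls between bio_endio and destroy, which is close to the net_done average (15710 usec) reported for peer 100.100.10.31:7792.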

  • Output request performance data statistics

> bsrmon /report reqstat r0 0 
Report r0 [REQUEST STAT - vnr0]
 Run: 2020-12-06_05:46:36.678 - 2020-12-08_09:45:46.219
  requests : total=38180
    before_queue    (usec): min=0, max=0, avg=0, samples=0
    before_al_begin (usec): min=6966, max=460112, avg=4264, samples=2
    in_actlog       (usec): min=1, max=589946, avg=549, samples=28
    submit          (usec): min=2, max=721793, avg=349, samples=72
    bio_endio       (usec): min=3, max=589987, avg=579, samples=28
    destroy         (usec): min=83, max=2362749, avg=46380, samples=161
  al_uptate : total=0
    before_bm_write (usec): min=0, max=0, avg=0, samples=0
    after_bm_write  (usec): min=0, max=0, avg=0, samples=0
    after_sync_page (usec): min=0, max=0, avg=0, samples=0
  PEER 100.100.10.31:7792:
    pre_send (usec): min=7, max=1667187, avg=22364, samples=161
    acked    (usec): min=11, max=1635581, avg=15737, samples=161
    net_done (usec): min=82, max=2362749, avg=39406, samples=161
  PEER bsr03:
    pre_send (usec): min=0, max=0, avg=0, samples=0
    acked    (usec): min=0, max=0, avg=0, samples=0
    net_done (usec): min=0, max=0, avg=0, samples=0

→ 38180 requests were processed during the period 2020-12-06_05:46:36.678 - 2020-12-08_09:45:46.219

→ al_update never occurred

peer_reqstat {resource} {vnr}

Aggregate data on the time taken to reach a specific segment, measured from when the peer request was created (usec)

  • peer_requests : Number of peer requests processed during the monitoring period (cycle)

  • submit : Time taken for a peer request submit to be executed

  • bio_endio: Time taken for the bio completion logic of a peer request to be executed.

  • destroy: Time taken for peer request to be destroyed

 Usage examples
  • Real-time monitoring of peer request performance data

> bsrmon /watch peer_reqstat r0 0 
2021-07-19_21:52:02.191
  PEER bsr-03:
    peer requests : 0
    submit    (usec): min=0, max=0, avg=0
    bio_endio (usec): min=0, max=0, avg=0
    destroy   (usec): min=0, max=0, avg=0
  PEER bsr-02:
    peer requests : 100
    submit    (usec): min=421, max=6907, avg=2184
    bio_endio (usec): min=1021, max=7312, avg=2563
    destroy   (usec): min=1739, max=7955, avg=3244

→ 100 peer requests for peer bsr-02 were processed

→ No peer requests occurred for peer bsr-03

  • Peer request performance statistics output

> bsrmon /report peer_reqstat r0 0
Report r0 [PEER REQUEST STAT - vnr0]
 Run: 2021-07-11_22:23:08.890 - 2021-07-19_21:56:18.179
  PEER bsr-02:
    peer requests : total=344054
    submit    (usec): min=1, max=36103, avg=99, samples=6902
    bio_endio (usec): min=1, max=116988, avg=96161, samples=9517
    destroy   (usec): min=47, max=117495, avg=96340, samples=9518
  PEER bsr-03:
    peer requests : total=133288
    submit    (usec): min=1, max=1670, avg=5, samples=6037
    bio_endio (usec): min=1, max=117000, avg=104660, samples=8708
    destroy   (usec): min=63, max=125871, avg=104839, samples=8709

→ Aggregated data for the period 2021-07-11_22:23:08.890 - 2021-07-19_21:56:18.179

→ BIO completion logic took an average of 0.1 seconds to execute

alstat {resource} {vnr}

Aggregate data for usage figures in active log

  • Aggregate used values, maximum value of used values

  • Aggregate incremental values of HITS, MISSES, STARVING, LOCKED, CHANGED

  • Count of AL_WAIT RETRY

  • Aggregation of AL shortage causes

    • starving, pending, used, busy, wouldblock

 Usage examples
  • Real-time monitoring of active log performance data

> bsrmon /watch alstat r0 0
2021-08-04_19:10:11.463
  used    :          3/67 (max=8)
  hits    :        579 (total=2843)
  misses  :       2250 (total=17564)
  starving:          0 (total=0)
  locked  :          0 (total=0)
  changed :        749 (total=5854)
  al_wait retry :          0 (total=0, max=0)
  pending_changes :  1/64
  error   : 0
    NOBUFS - starving : 0
           - pending slot : 0
           - used slot : 0
    BUSY : 0
    WOULDBLOCK : 0
  flags   : __LC_DIRTY __LC_LOCKED 

→ 3 AL slots are in use at that time

→ 1 AL slot has a pending change

→ The __LC_DIRTY and __LC_LOCKED flags are set

  • Active log performance statistics output

> bsrmon /report alstat r0 0
Report r0 [AL STAT - vnr0]
 Run: 2021-08-03_04:48:33.721 - 2021-08-03_22:47:28.326
  al_extents : 6001
    used     : max=0(all_slot_used=0), avg=0
    hits     : total=0
    misses   : total=0
    starving : total=0
    locked   : total=0
    changed  : total=0
    al_wait retry count : max=0, total=0
    pending_changes     : max=0, total=0
    error : total=0
      NOBUFS - starving     : total=0
             - pending slot : total=0
             - used    slot : total=0
      BUSY       : total=0
      WOULDBLOCK : total=0
 -> al_extents changed 
 Run: 2021-08-03_22:47:52.895 - 2021-08-04_19:27:06.522
  al_extents : 67
    used     : max=67(all_slot_used=59), avg=3
    hits     : total=337528
    misses   : total=2020409
    starving : total=2
    locked   : total=0
    changed  : total=673370
    al_wait retry count : max=2, total=501
    pending_changes     : max=64, total=258
    error : total=1004
      NOBUFS - starving     : total=501
             - pending slot : total=0
             - used    slot : total=503
      BUSY       : total=0
      WOULDBLOCK : total=0

→ al_extents value changed from 6001 to 67 on 2021-08-03_22:47:52.895

→ With al_extents set to 67, all AL slots were in use 59 times, the STARVING condition occurred 2 times, and AL_WAIT RETRY occurred 501 times
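
The hits and misses totals can be combined into a single figure when comparing al_extents settings. The following Python sketch computes a hit ratio from the second run of the report above. The hit ratio is not a field that bsrmon prints, and treating hits + misses as the total number of AL lookups is an assumption made for this illustration. The helper name al_hit_ratio is hypothetical.

```python
def al_hit_ratio(hits, misses):
    """Fraction of AL lookups that were hits; 0.0 when nothing was recorded."""
    total = hits + misses
    return hits / total if total else 0.0


# Totals from the run with al_extents : 67 in the report above.
ratio = al_hit_ratio(337528, 2020409)
print(f"{ratio:.1%}")
```

A low ratio together with a non-zero NOBUFS error count may suggest that the al_extents value is small for the workload, but this is only one possible reading of the data.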

network {resource}

Replication network sending and receiving speed (byte/s)

 Usage examples
  • Real-time monitoring of network performance data

> bsrmon /watch network r0
2020-12-08_09:47:45.84
  PEER 100.100.10.31:7792:
    send (byte/s): 62932184
    recv (byte/s): 4820
  PEER bsr03:
    send (byte/s): 0
    recv (byte/s): 0
2020-12-08_09:47:46.160
  PEER 100.100.10.31:7792:
    send (byte/s): 104896568
    recv (byte/s): 12320
  PEER bsr03:
    send (byte/s): 0
    recv (byte/s): 0
...

→ Disconnected from peer bsr03

→ Data is being transferred to peer 100.100.10.31:7792

  • Network performance statistics output

> bsrmon /report network r0
Report r0 [NETWORK SPEED]
 Run: 2020-11-30_06:38:44.653 - 2020-12-08_09:53:31.455
  PEER 100.100.10.31:7792: send=52497905byte/s, receive=5902byte/s
    send (byte/s): min=3, max=115395016, avg=52497905, samples=392
    recv (byte/s): min=3, max=15004, avg=5902, samples=367
  PEER bsr03: send=0byte/s, receive=0byte/s
    send (byte/s): min=0, max=0, avg=0, samples=0
    recv (byte/s): min=0, max=0, avg=0, samples=0

→ On average, about 52.4 MB of data per second is being sent to peer 100.100.10.31

sendbuf {resource}

Send buffer usage (bytes)

 Usage examples
  • Real-time monitoring of send buffer performance data

> bsrmon /watch sendbuf r0
2020-12-08_09:54:13.735
  PEER 100.100.10.31:7792:
    highwater: 11246, fill: 37094400bytes
        ap_in_flight: 8881 (9094144bytes)
        rs_in_flight: 2365 (28000256bytes)
    data stream
        size (bytes): 10485760000
        used (bytes): 27098576
         [P_DATA]  -  cnt: 7542  size: 8024616bytes
         [P_RS_DATA_REPLY]  -  cnt: 1596  size: 18942304bytes
         [P_BARRIER]  -  cnt: 549  size: 13176bytes
         [P_UNPLUG_REMOTE]  -  cnt: 7406  size: 118496bytes
    control stream
        size (bytes): 5242880
        used (bytes): 0
  PEER bsr03:
    highwater: 0, fill: 0bytes
        ap_in_flight: 0 (0bytes)
        rs_in_flight: 0 (0bytes)
    data stream
        size (bytes): 10485761
        used (bytes): 0
    control stream
        size (bytes): 5242881
        used (bytes): 0
...

→ Send buffers are allocated for peers 100.100.10.31:7792 and bsr03

→ Disconnected from peer bsr03

→ Data is being transferred to peer 100.100.10.31:7792

→ 11246 replication and resync data items are pending for peer 100.100.10.31:7792 (8881 replication, 2365 resync)

  • Send buffer performance statistics output

> bsrmon /report sendbuf r0
Report r0 [SEND BUFFER]
 Run: 2020-12-05_13:26:59.969 - 2020-12-08_09:56:33.718
  PEER 100.100.10.31:7792: data stream size=10485761byte, control stream size=5242881byte
    data-used (bytes): min=2097232, max=7603084, avg=4787174, samples=5
    cntl-used (bytes): min=0, max=0, avg=0, samples=0
    highwater: min=1, max=8014, avg=760, samples=999
  PEER bsr03: data stream size=10485761byte, control stream size=5242881byte
    data-used (bytes): min=0, max=0, avg=0, samples=0
    cntl-used (bytes): min=0, max=0, avg=0, samples=0

→ On average, 4.7 MB of the 10 MB buffer is in use
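
The same reading can be expressed as a utilization ratio. The following Python sketch divides the data-used figures of the report above by the data stream size. The helper name buffer_utilization is hypothetical and is not part of bsrmon.

```python
def buffer_utilization(used_bytes, size_bytes):
    """Return used/size as a fraction of the send buffer size."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    return used_bytes / size_bytes


# Values from the report above: data stream size and data-used avg/max.
DATA_STREAM_SIZE = 10485761
avg_util = buffer_utilization(4787174, DATA_STREAM_SIZE)
max_util = buffer_utilization(7603084, DATA_STREAM_SIZE)
print(f"avg={avg_util:.0%} max={max_util:.0%}")
```

If the maximum stays well below 100%, the send buffer was never full during the reported period.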

memstat

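Memory usage of user-space processes and the kernel module

In user space, the memory used by the bsradm, bsrsetup, bsrcon, bsrmon, and bsrservice processes is aggregated.

  • Windows

    • Memory information obtained through GetProcessMemoryInfo()

      • WorkingSetSize, QuotaPagedPoolUsage, QuotaNonPagedPoolUsage, PagefileUsage

  • Linux

    • Memory information obtained through the ps command

      • rsz : Physical memory usage

      • vsz : Virtual memory usage

For memory used by the kernel module, the following information is aggregated for each OS.

  • Windows

    • Nonpaged and paged memory usage allocated by the driver with the 'BS--' tag

  • Linux

    • slab cache information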

      • /sys/kernel/slab/bsr_req

      • /sys/kernel/slab/bsr_al

      • /sys/kernel/slab/bsr_bm

      • /sys/kernel/slab/bsr_ee

 Usage examples
  • Real-time monitoring of Linux memory usage

> bsrmon /watch memstat
2020-12-08_09:57:27.171
  module (bytes)
    BSR_REQ : 16334336
    BSR_AL  : 803760
    BSR_BM  : 4161536
    BSR_EE  : 2782560
  user (kbytes)
    name      pid    rsz        vsz       
    bsrmon    29304  1192       12724     
    bsrmon    37474  1200       12720     
    bsrmon    112177 1192       12724     
    bsrmon    113913 1068       12724     
    bsrmon    113978 1308       12728   

→ Multiple bsrmon processes are running, which means some bsrmon processes have not terminated
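
Leftover processes like these can be spotted by counting the rows per process name in the user table. The following Python sketch does this for the table shown above. The helper name count_processes is hypothetical, and the sketch assumes that every data row has the four columns name, pid, rsz, and vsz.

```python
from collections import Counter


def count_processes(table_text):
    """Count rows per process name in a 'name pid rsz vsz' table."""
    counts = Counter()
    for line in table_text.splitlines():
        fields = line.split()
        # Data rows have four columns and a numeric pid; the header row does not.
        if len(fields) == 4 and fields[1].isdigit():
            counts[fields[0]] += 1
    return counts


table = """\
    name      pid    rsz        vsz
    bsrmon    29304  1192       12724
    bsrmon    37474  1200       12720
    bsrmon    112177 1192       12724
    bsrmon    113913 1068       12724
    bsrmon    113978 1308       12728
"""
counts = count_processes(table)
print(counts["bsrmon"])
```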

  • Memory usage statistics output

> bsrmon /report memstat
Report [MEMORY]
 Run: 2020-12-06_19:04:12.177 - 2020-12-08_09:59:17.716
 module (bytes)
  BSR_REQ: 16303104 - 16459264    
  BSR_AL : 803760                 
  BSR_BM : 4063232 - 5271552      
  BSR_EE : 2782560                
 user (kbytes)
  name          rsz                     vsz
  bsradm        0                       0                       
  bsrsetup      0                       0                       
  bsrmeta       0                       0                       
  bsrmon        1068 - 1368             12720 - 12732     

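→ Outputs the range of memory usage for the data collected during the period 2020-12-06_19:04:12.177 - 2020-12-08_09:59:17.716

→ BSR_REQ used between 16303104 and 16459264 bytes

→ No data was collected for user-space processes other than bsrmon, which indicates that the CLI commands completed within 1 second.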

resync_ratio {resource} {vnr}

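Real-time replication and resync transfer volumes, and the resulting resync transfer ratio

Updated only on the source node while resync is in progress.

 Usage examples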
> bsrmon /watch resync_ratio r0 0
2022-04-12_15:34:52.206
svr06
    replcation(144100kb)/resync(18508kb),  resync ratio 11%

→ 144100 kb of replication data and 18508 kb of resync data per second were sent to the connected node svr06, which gives a resync transfer ratio of 11%.
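
The 11% in this example is consistent with dividing the resync volume by the sum of the replication and resync volumes. The following Python sketch reproduces that calculation. The formula and the truncation to a whole percent are inferred from the sample numbers only and are not confirmed against the bsrmon implementation. The helper name resync_ratio_percent is hypothetical.

```python
def resync_ratio_percent(replication_kb, resync_kb):
    """Resync share of the total transfer as a whole percent (truncated)."""
    total = replication_kb + resync_kb
    if total == 0:
        return 0
    return resync_kb * 100 // total


# Values from the watch output above.
print(resync_ratio_percent(144100, 18508))
```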

Commands

The commands provided by bsrmon.

/start

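Enables the performance monitor and starts performance data aggregation and file logging. The performance monitor is enabled by default.

The performance monitor is enabled through the following steps.

  • Windows

    • Set the registry value HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\bsr\bsrmon_run to 1

    • Enable the performance data aggregation logic of the bsr engine

    • Enable the logic in bsrservice that runs bsrmon /file

  • Linux

    • Set the value of /etc/bsr.d/.bsrmon_run to 1

    • Enable the performance data aggregation logic of the bsr engine

    • Run the /lib/bsr/bsrmon-run script, which executes the bsrmon /file command periodically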

/stop

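Disables the performance monitor. When it is disabled, the engine stops aggregating performance data and file logging stops.

The performance monitor is disabled through the following steps.

  • Windows

    • Set the registry value HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\bsr\bsrmon_run to 0

    • Disable the performance data aggregation logic of the bsr engine

    • Disable the logic in bsrservice that runs bsrmon /file

  • Linux

    • Set the value of /etc/bsr.d/.bsrmon_run to 0

    • Disable the performance data aggregation logic of the bsr engine

    • Terminate the /lib/bsr/bsrmon-run script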
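
On Linux, the flag written by /start and /stop can be inspected directly. The following Python sketch reads /etc/bsr.d/.bsrmon_run and interprets its content. It assumes that the file stores the value as plain text (1 or 0), which this page does not state explicitly. The helper names parse_run_flag and read_run_flag are hypothetical.

```python
from pathlib import Path


def parse_run_flag(text):
    """Interpret the flag content: '1' is enabled, '0' is disabled, anything else is unknown."""
    value = text.strip()
    if value == "1":
        return True
    if value == "0":
        return False
    return None


def read_run_flag(path="/etc/bsr.d/.bsrmon_run"):
    """Return True/False from the flag file, or None if it is missing or unreadable as a flag."""
    flag_file = Path(path)
    if not flag_file.is_file():
        return None
    return parse_run_flag(flag_file.read_text())


print(parse_run_flag("1\n"), parse_run_flag("0\n"))
```

A missing file is reported as None instead of False because the performance monitor is enabled by default. The /status command remains the supported way to query the state.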

/status

Queries the operating status of the performance monitor.

/file

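Saves performance data and memory information to files. This is a hidden command used by bsrservice and the bsrmon-run script.

The files are saved in the following locations.

  • Windows : C:\Program Files\bsr\log\perfmon

  • Linux : /var/log/bsr/perfmon/

Only the query time and the numeric values of each item are stored. When the /show, /watch, or /report command is run, the data is parsed and printed in the corresponding format.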

2021-07-20_05:55:21.443 955 955 244480 244480 1042 1042 266752 266752
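
A stored line of this form can be split into a timestamp and its numeric values. The following Python sketch parses the sample line above. It does not assign a meaning to the individual columns, because the column layout of each file is not described here. The helper name parse_perf_line is hypothetical.

```python
from datetime import datetime


def parse_perf_line(line):
    """Split one stored line into (timestamp, list of integer values)."""
    fields = line.split()
    # Timestamp format as it appears in the sample: YYYY-MM-DD_HH:MM:SS.mmm
    timestamp = datetime.strptime(fields[0], "%Y-%m-%d_%H:%M:%S.%f")
    values = [int(v) for v in fields[1:]]
    return timestamp, values


ts, values = parse_perf_line(
    "2021-07-20_05:55:21.443 955 955 244480 244480 1042 1042 266752 266752"
)
print(ts.isoformat(), values)
```

Python's %f directive pads a short fraction on the right, so a fraction written without leading zeros (for example .32 meaning 32 ms) would be read as 320 ms. Check the padding of the stored timestamps before relying on sub-second precision.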

/show

  • [/t {types[,...]|all}] [/r {resource[,...]|all}] [/j|/json] [/c|/continue]

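Prints the performance data of all types for all resources at once. The output is the data most recently recorded in the performance files.

  • [/t {types[,...]|all}] [/r {resource[,...]|all}]

Options for limiting the output to specific types and resources. Separate multiple types or resource names with ',' and do not insert spaces.

  • [/j|/json]

Prints the data in JSON format.

  • [/c|/continue]

Prints the performance data periodically. The output interval follows the period value set in bsrmon.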
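
When /show is called from a script, the argument list can be assembled from the options described above. The following Python sketch builds such a list. The helper name build_show_args and its parameters are hypothetical, and only the options documented on this page are used.

```python
def build_show_args(types=None, resources=None, as_json=False, continuous=False):
    """Build an argument list for 'bsrmon /show' following the documented syntax."""
    args = ["bsrmon", "/show"]
    if types:
        # Types are joined with ',' and no spaces, as the /t option requires.
        args += ["/t", ",".join(types)]
    if resources:
        args += ["/r", ",".join(resources)]
    if as_json:
        args.append("/j")
    if continuous:
        args.append("/c")
    return args


print(" ".join(build_show_args(["network", "sendbuf"], ["r0", "r1"])))
```

On a node where bsrmon is installed, the returned list can be passed to subprocess.run. The structure of the JSON output is not described on this page, so any parsing of it should be checked against real output first.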

  • Usage examples

Print all types for all resources

 bsrmon /show
  • A timestamp is printed for memory and for each resource.

  • iostat, ioclat, reqstat, peer_reqstat, al_stat, and resync_ratio are printed separately for each vnr.

  • The items printed for memory performance data differ by OS (Windows/Linux).

[root@cent79_01 bsr-utils]# bsrmon /show
bsrmon {
    memory {
        system {
            total_memory        7990028; # kbytes
            used_memory 1843756; # kbytes
            free_memory 2511096; # kbytes
            buff/cache  3635176; # kbytes
        }
        module {
            slab {
                bsr_req 15982; # kbytes
                bsr_al  1020; # kbytes
                bsr_bm  7880; # kbytes
                bsr_ee  2384; # kbytes
                total_bio_set   520; # kbytes
                kmalloc 157; # kbytes
            }
            vmalloc     0; # kbytes
            total_page_pool     33344; # kbytes
        }
        user {
            top_process {
                name    gnome-shell;
                pid     2482;
                rsz     250212; # kbytes
                vsz     4082020; # kbytes
            }
            bsr_process {
                name    bsrmon;
                pid     19370;
                rsz     1364; # kbytes
                vsz     12780; # kbytes
            }
        }
        timestamp 2022-10-04_00:10:55.805;
    }
    resource r0 {
        vnr 0 {
            iostat {
                read_iops       0;
                read_iocnt      0;
                read_kbs        0; # kbytes/second
                read_kb 0; # kbytes
                write_iops      0;
                write_iocnt     0;
                write_kbs       0; # kbytes/second
                write_kb        0; # kbytes
            }
            ioclat {
                local_min       0; # usec
                local_max       0; # usec
                local_avg       0; # usec
                master_min      0; # usec
                master_max      0; # usec
                master_avg      0; # usec
            }
            reqstat {
                requests {
                    count       0;
                    before_queue_min    0; # usec
                    before_queue_max    0; # usec
                    before_queue_avg    0; # usec
                    before_al_begin_min 0; # usec
                    before_al_begin_max 0; # usec
                    before_al_begin_avg 0; # usec
                    in_actlog_min       0; # usec
                    in_actlog_max       0; # usec
                    in_actlog_avg       0; # usec
                    submit_min  0; # usec
                    submit_max  0; # usec
                    submit_avg  0; # usec
                    bio_endio_min       0; # usec
                    bio_endio_max       0; # usec
                    bio_endio_avg       0; # usec
                    destroy_min 0; # usec
                    destroy_max 0; # usec
                    destroy_avg 0; # usec
                }
                al_update {
                    count       0;
                    before_bm_write_min 0; # usec
                    before_bm_write_max 0; # usec
                    before_bm_write_avg 0; # usec
                    after_bm_write_min  0; # usec
                    after_bm_write_max  0; # usec
                    after_bm_write_avg  0; # usec
                    after_sync_page_min 0; # usec
                    after_sync_page_max 0; # usec
                    after_sync_page_avg 0; # usec
                }
                peer cent79_03 {
                    pre_send_min        0; # usec
                    pre_send_max        0; # usec
                    pre_send_avg        0; # usec
                    acked_min   0; # usec
                    acked_max   0; # usec
                    acked_avg   0; # usec
                    net_done_min        0; # usec
                    net_done_max        0; # usec
                    net_done_avg        0; # usec
                }
            }
            peer_reqstat {
                peer cent79_03 {
                    count       0;
                    submit_min  0; # usec
                    submit_max  0; # usec
                    submit_avg  0; # usec
                    bio_endio_min       0; # usec
                    bio_endio_max       0; # usec
                    bio_endio_avg       0; # usec
                    destroy_min 0; # usec
                    destroy_max 0; # usec
                    destroy_avg 0; # usec
                }
            }
            al_stat {
                al-extents      6001;
                al_used 0;
                al_used_max     0;
                hits    0;
                hits_total      2064;
                misses  0;
                misses_total    1998;
                starving        0;
                starving_total  0;
                locked  0;
                locked_total    0;
                changed 0;
                changed_total   249;
                al_wait_retry_cnt       0;
                al_wait_total_retry_cnt 0;
                al_wait_max_retry_cnt   0;
                pending_changes 0;
                max_pending_changes     64;
                error {
                    nobufs_starving     0;
                    nobufs_pending_slot 0;
                    nobufs_used_slot    0;
                    busy        0;
                    wouldblock  0;
                }
                flags   __LC_DIRTY,__LC_LOCKED;
            }
            resync_ratio {
                peer cent79_02 {
                    replication	0; # byte/second
                    resync	0; # byte/second
                    resync_ratio	0; # percent
                }
                peer cent79_03 {
                    replication	0; # byte/second
                    resync	0; # byte/second
                    resync_ratio	0; # percent
                }
            }
        }
        vnr 1 {
            iostat {
                read_iops       0;
                read_iocnt      0;
                read_kbs        0; # kbytes/second
                read_kb 0; # kbytes
                write_iops      0;
                write_iocnt     0;
                write_kbs       0; # kbytes/second
                write_kb        0; # kbytes
            }
            ioclat {
                local_min       0; # usec
                local_max       0; # usec
                local_avg       0; # usec
                master_min      0; # usec
                master_max      0; # usec
                master_avg      0; # usec
            }
            reqstat {
                requests {
                    count       0;
                    before_queue_min    0; # usec
                    before_queue_max    0; # usec
                    before_queue_avg    0; # usec
                    before_al_begin_min 0; # usec
                    before_al_begin_max 0; # usec
                    before_al_begin_avg 0; # usec
                    in_actlog_min       0; # usec
                    in_actlog_max       0; # usec
                    in_actlog_avg       0; # usec
                    submit_min  0; # usec
                    submit_max  0; # usec
                    submit_avg  0; # usec
                    bio_endio_min       0; # usec
                    bio_endio_max       0; # usec
                    bio_endio_avg       0; # usec
                    destroy_min 0; # usec
                    destroy_max 0; # usec
                    destroy_avg 0; # usec
                }
                al_update {
                    count       0;
                    before_bm_write_min 0; # usec
                    before_bm_write_max 0; # usec
                    before_bm_write_avg 0; # usec
                    after_bm_write_min  0; # usec
                    after_bm_write_max  0; # usec
                    after_bm_write_avg  0; # usec
                    after_sync_page_min 0; # usec
                    after_sync_page_max 0; # usec
                    after_sync_page_avg 0; # usec
                }
                peer cent79_03 {
                    pre_send_min        0; # usec
                    pre_send_max        0; # usec
                    pre_send_avg        0; # usec
                    acked_min   0; # usec
                    acked_max   0; # usec
                    acked_avg   0; # usec
                    net_done_min        0; # usec
                    net_done_max        0; # usec
                    net_done_avg        0; # usec
                }
            }
            peer_reqstat {
                peer cent79_03 {
                    count       0;
                    submit_min  0; # usec
                    submit_max  0; # usec
                    submit_avg  0; # usec
                    bio_endio_min       0; # usec
                    bio_endio_max       0; # usec
                    bio_endio_avg       0; # usec
                    destroy_min 0; # usec
                    destroy_max 0; # usec
                    destroy_avg 0; # usec
                }
            }
            al_stat {
                al-extents      6001;
                al_used 0;
                al_used_max     0;
                hits    0;
                hits_total      0;
                misses  0;
                misses_total    0;
                starving        0;
                starving_total  0;
                locked  0;
                locked_total    0;
                changed 0;
                changed_total   0;
                al_wait_retry_cnt       0;
                al_wait_total_retry_cnt 0;
                al_wait_max_retry_cnt   0;
                pending_changes 0;
                max_pending_changes     64;
                error {
                    nobufs_starving     0;
                    nobufs_pending_slot 0;
                    nobufs_used_slot    0;
                    busy        0;
                    wouldblock  0;
                }
                flags   NONE;
            }
            resync_ratio {
                peer cent79_02 {
                    replication	0; # byte/second
                    resync	0; # byte/second
                    resync_ratio	0; # percent
                }
                peer cent79_03 {
                    replication	0; # byte/second
                    resync	0; # byte/second
                    resync_ratio	0; # percent
                }
            }
        }
        network {
            peer cent79_02 {
                send    180; # byte/second
                recv    384; # byte/second
            }
            peer cent79_03 {
                send    0; # byte/second
                recv    0; # byte/second
            }
        }
        sendbuf {
            peer cent79_02 {
                ap_in_flight {
                    size        17301504; # bytes
                    count       33;
                }
                rs_in_flight {
                    size        0; # bytes
                    count       0;
                }
                highwater       33;
                fill    17301504; # bytes
                data_stream {
                    size        20971520; # bytes
                    used        0; # bytes
                    packet {
                        name    P_DATA
                        count   1;
                        size    0; # bytes
                    }
                }
                control_stream {
                    size        5242880; # bytes
                    used        0; # bytes
                }
            }
            peer cent79_03 {
                ap_in_flight {
                    size        0; # bytes
                    count       0;
                }
                rs_in_flight {
                    size        0; # bytes
                    count       0;
                }
                highwater       0;
                fill    0; # bytes
                data_stream {
                    size        20971520; # bytes
                    used        0; # bytes
                }
                control_stream {
                    size        5242880; # bytes
                    used        0; # bytes
                }
            }
        }
        timestamp 2022-09-29_22:54:25.064;
    }
    resource r1 {
        ...
        timestamp 2022-10-04_00:10:55.805;
    }
}

Print specific types for all resources

 bsrmon /show /t iostat
[root@cent79_01 bsr-utils]# bsrmon /show /t iostat
bsrmon {
    resource r0 {
        vnr 0 {
            iostat {
                read_iops       0;
                read_iocnt      0;
                read_kbs        0; # kbytes/second
                read_kb 0; # kbytes
                write_iops      0;
                write_iocnt     0;
                write_kbs       0; # kbytes/second
                write_kb        0; # kbytes
            }
        }
        vnr 1 {
            iostat {
                read_iops       0;
                read_iocnt      0;
                read_kbs        0; # kbytes/second
                read_kb 0; # kbytes
                write_iops      0;
                write_iocnt     0;
                write_kbs       0; # kbytes/second
                write_kb        0; # kbytes
            }
        }
        timestamp 2022-09-29_22:54:25.064;
    }
    resource r1 {
        vnr 0 {
            iostat {
                read_iops       0;
                read_iocnt      0;
                read_kbs        0; # kbytes/second
                read_kb 0; # kbytes
                write_iops      0;
                write_iocnt     0;
                write_kbs       0; # kbytes/second
                write_kb        0; # kbytes
            }
        }
        timestamp 2022-10-04_00:17:52.249;
    }
}

Output specific items for specific resources

 bsrmon /show /t network,sendbuf /r r0,r1
[root@cent79_01 bsr-utils]# bsrmon /show /t network,sendbuf /r r0,r1
bsrmon {
    resource r0 {
        network {
            peer cent79_02 {
                send    180; # byte/second
                recv    384; # byte/second
            }
            peer cent79_03 {
                send    0; # byte/second
                recv    0; # byte/second
            }
        }
        sendbuf {
            peer cent79_02 {
                ap_in_flight {
                    size        17301504; # bytes
                    count       33;
                }
                rs_in_flight {
                    size        0; # bytes
                    count       0;
                }
                highwater       33;
                fill    17301504; # bytes
                data_stream {
                    size        20971520; # bytes
                    used        0; # bytes
                    packet {
                        name    P_DATA
                        count   1;
                        size    0; # bytes
                    }
                }
                control_stream {
                    size        5242880; # bytes
                    used        0; # bytes
                }
            }
            peer cent79_03 {
                ap_in_flight {
                    size        0; # bytes
                    count       0;
                }
                rs_in_flight {
                    size        0; # bytes
                    count       0;
                }
                highwater       0;
                fill    0; # bytes
                data_stream {
                    size        20971520; # bytes
                    used        0; # bytes
                }
                control_stream {
                    size        5242880; # bytes
                    used        0; # bytes
                }
            }
        }
        timestamp 2022-09-29_22:54:24.021;
    }
    resource r1 {
        network {
            peer cent79_02 {
                send    0; # byte/second
                recv    0; # byte/second
            }
            peer cent79_03 {
                send    0; # byte/second
                recv    0; # byte/second
            }
        }
        sendbuf {
            peer cent79_02 {
                ap_in_flight {
                    size        0; # bytes
                    count       0;
                }
                rs_in_flight {
                    size        0; # bytes
                    count       0;
                }
                highwater       0;
                fill    0; # bytes
                data_stream {
                    size        0; # bytes
                    used        0; # bytes
                }
                control_stream {
                    size        0; # bytes
                    used        0; # bytes
                }
            }
            peer cent79_03 {
                ap_in_flight {
                    size        0; # bytes
                    count       0;
                }
                rs_in_flight {
                    size        0; # bytes
                    count       0;
                }
                highwater       0;
                fill    0; # bytes
                data_stream {
                    size        0; # bytes
                    used        0; # bytes
                }
                control_stream {
                    size        0; # bytes
                    used        0; # bytes
                }
            }
        }
        timestamp 2022-10-04_00:19:53.026;
    }
}
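
The /show output above is a nested block format, so it can be loaded into a dictionary for scripting. The following is a minimal sketch, not part of bsrmon: the function name parse_show is hypothetical, and the parsing rules are inferred only from the sample output shown on this page (blocks opened by "{", closed by "}", values optionally ending in ";" and followed by a "#" unit comment). It assumes only the command output is passed in, without the shell prompt line.

```python
def parse_show(text):
    """Parse bsrmon /show style output into nested dicts (illustrative sketch)."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        # Drop trailing unit comments such as "# bytes".
        line = raw.split("#", 1)[0].strip()
        if not line or line == "...":
            continue  # skip blank lines and elided sections
        if line.endswith("{"):
            # Block header, e.g. "resource r0" or "peer cent79_02".
            key = " ".join(line[:-1].split())
            child = {}
            stack[-1][key] = child  # a repeated block name overwrites the earlier one
            stack.append(child)
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced closing brace")
            stack.pop()
        else:
            # "key value;" pair; the semicolon is optional (see "name P_DATA").
            parts = line.rstrip(";").split(None, 1)
            value = parts[1].strip() if len(parts) > 1 else ""
            stack[-1][parts[0]] = int(value) if value.isdigit() else value
    if len(stack) != 1:
        raise ValueError("unclosed block")
    return root


sample = """
bsrmon {
    resource r0 {
        network {
            peer cent79_02 {
                send    180; # byte/second
                recv    384; # byte/second
            }
        }
        timestamp 2022-09-29_22:54:24.021;
    }
}
"""
data = parse_show(sample)
print(data["bsrmon"]["resource r0"]["network"]["peer cent79_02"]["send"])
```

Block names are kept as full strings such as "resource r0", so a lookup mirrors the text of the output. If a block can contain several entries with the same name (for example multiple packet blocks), the sketch keeps only the last one and would need a list there.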

/watch

  • {types} [/scroll]

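Outputs the data being aggregated for each type in real time. With the /scroll option, the output is printed line by line (scrolling) instead of being refreshed in place.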

/report

  • {types} [/f {filename}] [/p {peer_name[,...]}] [/d {YYYY-MM-DD}] [/s {timestamp}] [/e {timestamp}]

Outputs statistics for the data recorded in files for each type, including data in backed-up files.

  • [/f {filename}]

Specifies the name of the file to report on. Use this to compute statistics for a specific file.

  • [/p {peer_name[,...]}]

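Specifies the name of the peer to report on. Use this to compute statistics for a specific peer. Separate multiple peers with commas (,) and no spaces. If omitted, statistics are output for the peers of the resources configured on the server.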

  • [/d {YYYY-MM-DD}] [/s {YYYY-MM-DD|hh:mm[:ss]|YYYY-MM-DD_hh:mm[:ss]}] [/e {YYYY-MM-DD|hh:mm[:ss]|YYYY-MM-DD_hh:mm[:ss]}]

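Options for querying figures for a specific period. /d outputs statistics for the data recorded on the given date. /s specifies the start date and time of the query, and /e specifies the end date and time. When a /s or /e value contains both a date and a time, separate them with an underscore (_).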

 Usage examples
  • bsrmon /report iostat r0 0 /d 2022-11-01

    • Outputs iostat statistics collected on November 1

  • bsrmon /report iostat r0 0 /s 2022-11-01 /e 2022-11-10

    • Outputs iostat statistics collected from November 1 to November 10

  • bsrmon /report iostat r0 0 /s 2022-11-01_09:00 /e 2022-11-10_20:00

    • Outputs iostat statistics collected from 09:00 on November 1 to 20:00 on November 10

  • bsrmon /report iostat r0 0 /s 09:00 /e 20:00

    • Outputs iostat statistics for 09:00 to 20:00, per day, across the entire collection period
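
The three value formats accepted by /s and /e can be checked before bsrmon is invoked. The following is a minimal sketch under stated assumptions: classify_timestamp and period_options are hypothetical helper names, not part of bsrmon, and the accepted formats are taken only from the option syntax listed above. Python's strptime also accepts fields without zero padding, so this check is slightly more lenient than the documented format.

```python
from datetime import datetime

# Formats taken from the /s and /e syntax: date, time, or date_time.
_FORMATS = {
    "date": ["%Y-%m-%d"],
    "time": ["%H:%M", "%H:%M:%S"],
    "datetime": ["%Y-%m-%d_%H:%M", "%Y-%m-%d_%H:%M:%S"],
}


def classify_timestamp(value):
    """Return 'date', 'time' or 'datetime' for a /s or /e value, else raise ValueError."""
    for kind, formats in _FORMATS.items():
        for fmt in formats:
            try:
                datetime.strptime(value, fmt)
                return kind
            except ValueError:
                continue
    raise ValueError(f"unsupported timestamp: {value!r}")


def period_options(start=None, end=None):
    """Build the /s and /e arguments after validating each value."""
    opts = []
    for flag, value in (("/s", start), ("/e", end)):
        if value is not None:
            classify_timestamp(value)  # raises on a malformed value
            opts += [flag, value]
    return opts


cmd = ["bsrmon", "/report", "iostat", "r0", "0"]
cmd += period_options("2022-11-01_09:00", "2022-11-10_20:00")
print(" ".join(cmd))
```

The printed command line matches the third usage example above. A value such as "2022-11-01 09:00" (space instead of underscore) is rejected by the sketch.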

/set

  • {period, file_size, file_cnt} {value}

Adjusts monitoring-related settings.

  • period

Sets the interval for file saving and monitoring. The value is in seconds; the default is 1 second.

  • file_size

Sets the file rolling size. The value is in MB; the default is 50 MB.

  • file_cnt

Sets the number of rolled files. The default is 3.
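
When /set is driven from a script, the setting name and value can be checked first. The following is a minimal sketch: set_command is a hypothetical helper, the setting names and defaults come from the list above, and the requirement that values be positive integers is an assumption rather than documented bsrmon behavior.

```python
# Setting names and defaults as documented above (seconds, MB, file count).
DEFAULTS = {"period": 1, "file_size": 50, "file_cnt": 3}


def set_command(name, value):
    """Return the argument list for 'bsrmon /set {name} {value}'."""
    if name not in DEFAULTS:
        raise ValueError(f"unknown setting: {name!r}")
    # Assumption: values are positive integers; bool is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
        raise ValueError(f"value for {name} must be a positive integer")
    return ["bsrmon", "/set", name, str(value)]


print(" ".join(set_command("period", 5)))
```

The returned list can be passed to subprocess.run on a host where bsrmon is installed.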

/get

  • {all, period, file_size, file_cnt}

Queries monitoring-related settings.

/io_delay_test

  • {flag} {delay point} {delay time}

Deliberately induces I/O performance degradation to verify the functionality of the bsr performance monitor. This feature is intended for developers.

/debug cmds options

A command for Windows that queries debugfs information.
