HBase Thrift2 High CPU Analysis, and Notes on Using the HBase Thrift Interface

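HBase provides a Thrift interface for non-Java languages. Based on experience with the HBase Thrift interface (HBase version 0.92.1), this article summarizes some of the problems encountered and the related points to watch.

1. Byte order

In HBase, rows (row key, column family, column qualifier and timestamp) are sorted in lexicographic byte order. For data of types such as short, int and long, the byte array produced by Bytes.toBytes(…) must therefore be stored in big-endian form (high-order byte at the low address, low-order byte at the high address). The same reasoning applies to values. So when using the Thrift API from C++, PHP, Python and so on, it is best to pack and unpack both rows and values consistently as big-endian.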
For example, in C++ an int variable is converted to its lexicographically ordered byte form as follows:


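  #include <arpa/inet.h>  // htonl (POSIX)
  #include <cstdint>
  #include <string>

  std::string key;
  int32_t timestamp = 1352563200;
  // Convert to big-endian first so the key bytes sort like the number.
  uint32_t be = htonl(static_cast<uint32_t>(timestamp));
  key.append(reinterpret_cast<const char*>(&be), sizeof(be));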
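Why big-endian matters can be checked directly. The following is a minimal sketch, not code from the original article: the helper names encode_be32, encode_le32 and key_less are made up for illustration, and it assumes non-negative values (a negative two's-complement int would sort after the positive ones). It encodes with shifts, so it does not depend on the host CPU's byte order.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Encode a 32-bit value as 4 bytes, most significant byte first (big-endian).
std::string encode_be32(uint32_t v) {
    std::string out(4, '\0');
    out[0] = static_cast<char>((v >> 24) & 0xFF);
    out[1] = static_cast<char>((v >> 16) & 0xFF);
    out[2] = static_cast<char>((v >> 8) & 0xFF);
    out[3] = static_cast<char>(v & 0xFF);
    return out;
}

// Little-endian encoding, shown only for contrast.
std::string encode_le32(uint32_t v) {
    std::string be = encode_be32(v);
    return std::string(be.rbegin(), be.rend());
}

// Compare two keys byte by byte as unsigned values, which is how a
// lexicographic byte ordering treats row keys.
bool key_less(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](char x, char y) {
            return static_cast<unsigned char>(x) < static_cast<unsigned char>(y);
        });
}

// 255 < 256 numerically; only the big-endian keys keep that order.
void check_ordering() {
    assert(key_less(encode_be32(255), encode_be32(256)));
    assert(!key_less(encode_le32(255), encode_le32(256)));
}
```

With little-endian bytes, 255 becomes FF 00 00 00 and 256 becomes 00 01 00 00, so 255 would sort after 256; that is the ordering mistake the big-endian rule avoids.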

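HBase Thrift2 High CPU Analysis

1. Symptom description

All external connections to port 9090 timed out, although telnet to the port always succeeded. Watching with top showed individual threads reaching up to 99.99% CPU, though not staying at 99.9% continuously; the figure fluctuated. After traffic to this machine was moved away, requests could succeed, but timeouts still occurred, with more read timeouts than write timeouts: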

# ./hbase_stress --hbase=110.13.136.207:9090 --test=2 --timeout=10
[2016-11-27 10:15:21/771][139756154767104/31562][ERROR][hbase_stress.cpp:302]TransportException(thrift://110.13.136.207:9090): EAGAIN (timed out)
[2016-11-27 10:15:31/775][139756154767104/31562][ERROR][hbase_stress.cpp:302]TransportException(thrift://110.13.136.207:9090): EAGAIN (timed out)

  PID USER      PR  NI    VIRT    RES   SHR S %CPU %MEM     TIME+ COMMAND
20727 zhangsan  20   0 10.843g 9.263g 26344 R 99.9 26.4   1448:00 java
20729 zhangsan  20   0 10.843g 9.263g 26344 R 99.9 26.4   1448:00 java
20730 zhangsan  20   0 10.843g 9.263g 26344 R 99.9 26.4   1449:10 java
20728 zhangsan  20   0 10.843g 9.263g 26344 R 99.8 26.4   1448:00 java
20693 zhangsan  20   0 10.843g 9.263g 26344 S  0.0 26.4   0:00.00 java

20727 zhangsan  20   0 10.843g 9.263g 26344 R 75.5 26.4   1448:06 java
20728 zhangsan  20   0 10.843g 9.263g 26344 R 75.2 26.4   1448:06 java
20729 zhangsan  20   0 10.843g 9.263g 26344 R 75.2 26.4   1448:06 java
20730 zhangsan  20   0 10.843g 9.263g 26344 R 75.2 26.4   1449:15 java
20716 zhangsan  20   0 10.843g 9.263g 26344 S 24.9 26.4  93:48.75 java


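For the byte-order case in the first part of this page, the reverse conversion from the lexicographically ordered bytes back to an int reads the four bytes out of the key and converts them from big-endian to host byte order.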
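The code that originally accompanied this sentence is not present on the page, so the following is only a sketch of what such a reverse conversion could look like for a 4-byte big-endian field. The function name decode_be32 and the offset parameter are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Rebuild a 32-bit value from 4 big-endian bytes starting at `offset`.
// Works the same on little-endian and big-endian hosts because it never
// reinterprets memory; it only shifts and ORs individual bytes.
uint32_t decode_be32(const std::string& key, std::size_t offset = 0) {
    assert(offset + 4 <= key.size());
    uint32_t v = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        v = (v << 8) | static_cast<unsigned char>(key[offset + i]);
    }
    return v;
}
```

A caller holding a row key whose first four bytes are the timestamp could write `int32_t timestamp = static_cast<int32_t>(decode_be32(key));`.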

2. Locating the problem

Use ps to find the threads consuming the most CPU; the result matches what top shows:

$ ps -mp 20693 -o THREAD,tid,time | sort -rn
zhangsan 18.8 19 - -      - - 20730 1-00:11:23
zhangsan 18.7 19 - -      - - 20729 1-00:10:13
zhangsan 18.7 19 - -      - - 20728 1-00:10:13
zhangsan 18.7 19 - -      - - 20727 1-00:10:13
zhangsan 16.1 19 - futex_ - - 20731   20:44:51
zhangsan  5.2 19 - futex_ - - 20732   06:46:39
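The thread ids printed by ps and top are decimal, whereas a jstack thread dump identifies each native thread by a hexadecimal nid, so the ids have to be converted before the two can be matched. A small sketch of that conversion (the function name tid_to_nid is made up for illustration):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Format a decimal Linux thread id the way it appears in the nid field
// of a jstack thread dump: lowercase hexadecimal with a 0x prefix.
std::string tid_to_nid(unsigned int tid) {
    char buf[32];
    int n = std::snprintf(buf, sizeof(buf), "0x%x", tid);
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof(buf));
    (void)n;  // silence unused-variable warnings when NDEBUG is defined
    return std::string(buf);
}
```

For the four busiest threads, 20727 through 20730, this gives 0x50f7 through 0x50fa. On a shell, `printf '0x%x\n' 20727` does the same job.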

Then, with the help of jstack, these turned out to be GC threads:

"Gang worker#0 (Parallel CMS Threads)" os_prio=0 tid=0x00007fb7200d4000 nid=0x50f7 runnable
"Gang worker#1 (Parallel CMS Threads)" os_prio=0 tid=0x00007fb7200d5800 nid=0x50f8 runnable
"Gang worker#2 (Parallel CMS Threads)" os_prio=0 tid=0x00007fb7200d7800 nid=0x50f9 runnable
"Gang worker#3 (Parallel CMS Threads)" os_prio=0 tid=0x00007fb7200d9000 nid=0x50fa runnable

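Checking GC with jstat shows a grim picture: the old generation (O) is 100% full, and full-GC time (FGCT) keeps growing while the young-GC count (YGC) stands still. The problem is caused by GC: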

$ jstat -gcutil 20693 1000 100
   S0    S1      E      O     M   CCS   YGC    YGCT   FGC      FGCT       GCT
 0.00 99.67 100.00 100.00 98.08 94.41 42199 369.132 27084 34869.601 35238.733
 0.00 99.67 100.00 100.00 98.08 94.41 42199 369.132 27084 34870.448 35239.580
 0.00 99.67 100.00 100.00 98.08 94.41 42199 369.132 27084 34870.448 35239.580
 0.00 99.67 100.00 100.00 98.08 94.41 42199 369.132 27084 34870.448 35239.580

$ jstat -gccapacity 20693
   NGCMN     NGCMX       NGC      S0C      S1C       EC    OGCMN     OGCMX       OGC        OC MCMN      MCMX      MC CCSMN     CCSMX   CCSC   YGC   FGC
191808.0 1107520.0 1107520.0 110720.0 110720.0 886080.0 383680.0 8094144.0 8094144.0 8094144.0  0.0 1077248.0 31584.0   0.0 1048576.0 3424.0 42199 27156

$ jstat -gcold 20693
     MC      MU   CCSC   CCSU        OC        OU   YGC   FGC      FGCT       GCT
31584.0 30978.7 3424.0 3232.7 8094144.0 8094144.0 42199 27174 34964.347 35333.479

$ jstat -gcoldcapacity 20693
   OGCMN     OGCMX       OGC        OC   YGC   FGC      FGCT       GCT
383680.0 8094144.0 8094144.0 8094144.0 42199 27192 34982.623 35351.755

$ jstat -gcnewcapacity 20693
   NGCMN     NGCMX       NGC    S0CMX      S0C    S1CMX      S1C     ECMX       EC   YGC   FGC
191808.0 1107520.0 1107520.0 110720.0 110720.0 110720.0 110720.0 886080.0 886080.0 42199 27202

$ jstat -gc 20693
     S0C      S1C S0U      S1U       EC       EU        OC        OU      MC      MU   CCSC   CCSU   YGC    YGCT   FGC      FGCT       GCT
110720.0 110720.0 0.0 110395.9 886080.0 886080.0 8094144.0 8094144.0 31584.0 30978.7 3424.0 3232.7 42199 369.132 27206 34996.538 35365.671

$ jstat -gcnew 20693
     S0C      S1C S0U      S1U TT MTT     DSS       EC       EU   YGC    YGCT
110720.0 110720.0 0.0 110396.9  6   6 55360.0 886080.0 886080.0 42199 369.132
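The pattern in the -gcutil rows (old generation pinned at 100%, full-GC time growing, young-GC count frozen) is simple enough to detect automatically. The sketch below parses one data row in the 11-column layout shown above and applies that check to two consecutive samples. The struct and function names are invented for illustration, and the 99% threshold is an assumption of this sketch rather than a JVM or HBase rule.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// One data row of `jstat -gcutil` in the 11-column layout shown above.
struct GcUtilRow {
    double s0, s1, eden, old_gen, meta, ccs;
    long ygc;
    double ygct;
    long fgc;
    double fgct, gct;
};

// Parses a whitespace-separated data row; returns false if any field is
// missing or not numeric (for example when given the header line).
bool parse_gcutil(const std::string& line, GcUtilRow& row) {
    std::istringstream in(line);
    in >> row.s0 >> row.s1 >> row.eden >> row.old_gen >> row.meta >> row.ccs
       >> row.ygc >> row.ygct >> row.fgc >> row.fgct >> row.gct;
    return !in.fail();
}

// Average seconds spent per full GC since the JVM started.
double avg_full_gc_seconds(const GcUtilRow& row) {
    assert(row.fgc > 0);
    return row.fgct / static_cast<double>(row.fgc);
}

// Heuristic for this sketch: the old generation is (nearly) full, no young
// collection completed between the samples, and full-GC time still grew.
bool looks_stuck_in_full_gc(const GcUtilRow& prev, const GcUtilRow& cur,
                            double old_gen_threshold = 99.0) {
    return cur.old_gen >= old_gen_threshold &&
           cur.ygc == prev.ygc &&
           cur.fgct > prev.fgct;
}
```

Applied to the first two -gcutil rows above, the check fires, and the average full GC works out to roughly 1.29 seconds (34869.601 s over 27084 collections).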

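lsof shows that the process holds only a small number of connections, well within a safe range, so the problem is most likely objects that cannot be reclaimed. Use jmap to look at memory in detail, starting with heap usage: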

$ jmap -heap 20693
Attaching to process ID 20693, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 25.77-b03

using parallel threads in the new generation.
using thread-local object allocation.
Concurrent Mark-Sweep GC

Heap Configuration:
   MinHeapFreeRatio         = 40
   MaxHeapFreeRatio         = 70
   MaxHeapSize              = 9422503936 (8986.0MB)
   NewSize                  = 196411392 (187.3125MB)
   MaxNewSize               = 1134100480 (1081.5625MB)
   OldSize                  = 392888320 (374.6875MB)
   NewRatio                 = 2
   SurvivorRatio            = 8
   MetaspaceSize            = 21807104 (20.796875MB)
   CompressedClassSpaceSize = 1073741824 (1024.0MB)
   MaxMetaspaceSize         = 17592186044415 MB
   G1HeapRegionSize         = 0 (0.0MB)

Heap Usage:
New Generation (Eden + 1 Survivor Space):
   capacity = 1020723200 (973.4375MB)
   used     = 1020398064 (973.1274261474609MB)
   free     = 325136 (0.3100738525390625MB)
   99.96814650632022% used
Eden Space:
   capacity = 907345920 (865.3125MB)
   used     = 907345920 (865.3125MB)
   free     = 0 (0.0MB)
   100.0% used
From Space:
   capacity = 113377280 (108.125MB)
   used     = 113052144 (107.81492614746094MB)
   free     = 325136 (0.3100738525390625MB)
   99.71322649476156% used
To Space:
   capacity = 113377280 (108.125MB)
   used     = 0 (0.0MB)
   free     = 113377280 (108.125MB)
   0.0% used
concurrent mark-sweep generation:
   capacity = 8288403456 (7904.4375MB)
   used     = 8288403424 (7904.437469482422MB)
   free     = 32 (3.0517578125E-5MB)
   99.9999996139184% used

10216 interned Strings occupying 934640 bytes.

Next, look at the objects themselves:

$ jmap -histo 20693
 num     #instances         #bytes  class name
----------------------------------------------
   1:      72835212     2518411456  [B
   2:      49827147     1993085880  java.util.TreeMap$Entry
   3:      12855993      617087664  java.util.TreeMap
   4:       4285217      445662568  org.apache.hadoop.hbase.client.ClientScanner
   5:       4285222      377099536  org.apache.hadoop.hbase.client.Scan
   6:       4284875      377069000  org.apache.hadoop.hbase.client.ScannerCallable
   7:       4285528      342921344  [Ljava.util.HashMap$Node;
   8:       4284880      308511360  org.apache.hadoop.hbase.client.ScannerCallableWithReplicas
   9:       8570671      274261472  java.util.LinkedList
  10:       4285579      205707792  java.util.HashMap
  11:       4285283      205693584  org.apache.hadoop.hbase.client.RpcRetryingCaller
  12:       3820914      152836560  org.apache.hadoop.hbase.filter.SingleColumnValueFilter
  13:       4291904      137340928  java.util.concurrent.ConcurrentHashMap$Node
  14:       8570636      137130176  java.util.TreeMap$EntrySet
  15:       4285278      137128896  org.apache.hadoop.hbase.io.TimeRange
  16:       8570479      137127664  java.util.concurrent.atomic.AtomicBoolean
  17:       2891409       92525088  org.apache.hadoop.hbase.NoTagsKeyValue
  18:       4286540       68584640  java.lang.Integer
  19:       4285298       68564768  java.util.TreeMap$KeySet
  20:       4285275       68564400  java.util.TreeSet
  21:       4285006       68560096  java.util.HashSet
  22:       4284851       68557616  java.util.HashMap$KeySet
  23:       3176118       50817888  org.apache.hadoop.hbase.filter.BinaryComparator
  24:           109       33607600  [Ljava.util.concurrent.ConcurrentHashMap$Node;
  25:        418775       18479112  [Lorg.apache.hadoop.hbase.Cell;
  26:        671443       17693224  [C
  27:        418781       16751240  org.apache.hadoop.hbase.client.Result
  28:        669739       16073736  java.lang.String
  29:        644796       15475104  org.apache.hadoop.hbase.filter.SubstringComparator
  30:        419134       10059216  java.util.LinkedList$Node

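To get the system working normally again, first apply a stopgap that treats the symptom only: monitor GC and restart the HBase Thrift2 process periodically. Then look for the root cause so that it can be fixed properly.

Judging from the jmap output above, where about 4.28 million ClientScanner objects and their associated Scan and ScannerCallable objects are still alive, the suspicion is that scanners are not being closed. There are two reasons a scanner may be left open: first, a bug in the client program that never closes it, which amounts to a memory leak; second, the client program hits an exception and never gets the chance to close it.

Reviewing the client source code confirmed that a scanner opened with openScanner is indeed not closed when an exception occurs. In addition, if the client is killed or loses power, the scanner cannot be released either; that is a problem HBase Thrift2 itself has to solve.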
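On the client side, one way to make sure a scanner is closed even when an exception interrupts the scan is an RAII guard. This is a sketch of the idea, not the author's actual fix. ScannerGuard is an invented name, and FakeClient is a hypothetical stand-in that only mimics the openScanner/closeScanner pair with simplified signatures; the real Thrift2-generated client also takes a table name and a scan description when opening a scanner.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a Thrift2 client: it hands out scanner ids and
// records which ones were closed, so the guard's behavior can be observed.
struct FakeClient {
    int32_t next_id = 1;
    std::vector<int32_t> closed;
    int32_t openScanner() { return next_id++; }
    void closeScanner(int32_t id) { closed.push_back(id); }
};

// RAII guard: closes the scanner when it goes out of scope, including during
// stack unwinding after an exception.
template <typename Client>
class ScannerGuard {
public:
    ScannerGuard(Client& client, int32_t scanner_id)
        : client_(client), id_(scanner_id) {}
    ~ScannerGuard() {
        try {
            client_.closeScanner(id_);
        } catch (...) {
            // A destructor must not throw; a failed close is dropped here.
        }
    }
    ScannerGuard(const ScannerGuard&) = delete;
    ScannerGuard& operator=(const ScannerGuard&) = delete;
    int32_t id() const { return id_; }

private:
    Client& client_;
    int32_t id_;
};

// Simulates a scan that fails midway; the guard still closes the scanner.
void scan_that_fails(FakeClient& client) {
    ScannerGuard<FakeClient> guard(client, client.openScanner());
    assert(guard.id() > 0);
    throw std::runtime_error("simulated failure while reading rows");
}
```

A guard like this covers only the exception case. A client that is killed or loses power never runs its destructors, which is why the server side still needs its own way to release abandoned scanners.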

