journalnode Can't scan a pre

A test-environment Hadoop cluster went down because a disk filled up. After the cluster was restarted, the JournalNode reported the following warning and exception:

2018-03-19 20:48:04,817 WARN  namenode.FSImage (EditLogFileInputStream.java:scanEditLog(359)) - Caught exception after scanning through 0 ops from /data1_4T/journal/mycluster/current/edits_inprogress_0000000000024973700 while determining its valid length. Position was 1011712

java.io.IOException: Can't scan a pre-transactional edit log.
        at org.apache.hadoop.hdfs.server.namenode.FSEditLogOp$LegacyReader.scanOp(FSEditLogOp.java:4974)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanNextOp(EditLogFileInputStream.java:245)
        at org.apache.hadoop.hdfs.server.namenode.EditLogFileInputStream.scanEditLog(EditLogFileInputStream.java:355)
        at org.apache.hadoop.hdfs.server.namenode.FileJournalManager$EditLogFile.scanLog(FileJournalManager.java:551)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.scanStorageForLatestEdits(Journal.java:192)
        at org.apache.hadoop.hdfs.qjournal.server.Journal.<init>(Journal.java:152)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:90)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNode.getOrCreateJournal(JournalNode.java:99)
        at org.apache.hadoop.hdfs.qjournal.server.JournalNodeRpcServer.getEditLogManifest(JournalNodeRpcServer.java:189)
        at org.apache.hadoop.hdfs.qjournal.protocolPB.QJournalProtocolServerSideTranslatorPB.getEditLogManifest(QJournalProtocolServerSideTranslatorPB.java:224)
        at org.apache.hadoop.hdfs.qjournal.protocol.QJournalProtocolProtos$QJournalProtocolService$2.callBlockingMethod(QJournalProtocolProtos.java:25431)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2351)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2347)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject
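The warning above names an in-progress segment, edits_inprogress_0000000000024973700, whose numeric suffix is the first transaction ID of that segment. When deciding which JournalNode still holds good data, it can help to list the segments in each node's current directory and compare them side by side. The following is only a minimal sketch: the function name summarize_segments is hypothetical, and the file-name patterns are the conventional HDFS ones (edits_<first>-<last> for finalized segments, edits_inprogress_<first> for the open one), assumed here rather than taken from the post.

```python
import re
from pathlib import Path

# Assumed naming convention for files under <journal dir>/current:
#   edits_<firstTxId>-<lastTxId>    finalized segment
#   edits_inprogress_<firstTxId>    segment that was still being written
FINALIZED = re.compile(r"^edits_(\d+)-(\d+)$")
IN_PROGRESS = re.compile(r"^edits_inprogress_(\d+)$")


def summarize_segments(current_dir):
    """Summarize the edit-log segments of one JournalNode 'current' directory.

    Returns a pair (finalized, in_progress):
      finalized   -- sorted list of (first_txid, last_txid)
      in_progress -- sorted list of (first_txid, size_in_bytes)
    Files that match neither pattern (VERSION, etc.) are ignored.
    """
    finalized, in_progress = [], []
    for path in Path(current_dir).iterdir():
        if not path.is_file():
            continue
        match = FINALIZED.match(path.name)
        if match:
            finalized.append((int(match.group(1)), int(match.group(2))))
            continue
        match = IN_PROGRESS.match(path.name)
        if match:
            in_progress.append((int(match.group(1)), path.stat().st_size))
    return sorted(finalized), sorted(in_progress)
```

Running this against /data1_4T/journal/mycluster/current on each of the three JournalNodes gives a quick comparison of which segments each node has. Any difference is only a hint; in the incident described here, the healthy node was identified from its JournalNode runtime log.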

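[Recovery procedure]

The NameNode could not start. We found that the edits files maintained by the JournalNodes were corrupted. Of the three JournalNodes, only one had a normal runtime log, so we tentatively judged that JournalNode to be usable and copied its data files to the other two.

1. Stop the cluster services.

2. On the two faulty JournalNode hosts, back up the data files by moving them to another directory, delete the JournalNode data files, and then copy the healthy JournalNode's data files onto these two hosts.

3. Fix the permissions of the copied files.

4. Start the services. The cluster returned to normal.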
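Steps 2 and 3 of the procedure (back up the damaged files, replace them with the healthy copy, fix permissions) could be scripted roughly as follows. This is a sketch under stated assumptions, not the commands actually used in the incident: the function replace_journal_dir and its parameters are hypothetical, the healthy JournalNode's directory is assumed to have already been transferred to the faulty host, and all cluster services are assumed to be stopped.

```python
import shutil
import time
from pathlib import Path


def replace_journal_dir(bad_dir, healthy_copy, backup_root, owner=None, group=None):
    """Back up a damaged journal directory, then restore it from a healthy copy.

    bad_dir      -- journal directory on the faulty host
    healthy_copy -- local copy of the same directory from the healthy JournalNode
    backup_root  -- directory that receives the damaged files
    owner, group -- optional account names to apply to the restored files
    Returns the path of the backup.
    """
    bad_dir = Path(bad_dir)
    healthy_copy = Path(healthy_copy)
    backup_root = Path(backup_root)
    if not healthy_copy.is_dir():
        raise FileNotFoundError(f"healthy copy not found: {healthy_copy}")

    # Move (rather than delete) the damaged data so it can be inspected later.
    backup_root.mkdir(parents=True, exist_ok=True)
    backup = backup_root / f"{bad_dir.name}.{time.strftime('%Y%m%d%H%M%S')}"
    shutil.move(str(bad_dir), str(backup))

    # Put the healthy data where the damaged directory used to be.
    shutil.copytree(healthy_copy, bad_dir)

    # Fix ownership only if requested; changing owner usually needs root.
    if owner or group:
        for path in [bad_dir, *bad_dir.rglob("*")]:
            shutil.chown(path, user=owner, group=group)
    return backup
```

On each faulty host this might be called as replace_journal_dir("/data1_4T/journal/mycluster", "/tmp/mycluster_from_healthy_jn", "/data1_4T/journal_backup", owner="hdfs", group="hadoop"). Only the first path comes from the log above; the other paths and the account names are placeholders to adapt to the actual cluster.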