Is there any way to improve file reading performance beyond what BufferedReader offers?

Since the OP is keen to see how this can be done with NIO, here is a comparison.

Because the file is small, the difference is hard to see, but it can be measured.

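import java.io.*;
import java.nio.*;
import java.nio.channels.FileChannel;
import java.nio.charset.*;

// Place the methods below inside a class of your choice.
// main writes a test file, then times three ways of counting its lines.
public static void main(String... args) throws IOException {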
    PrintWriter pw = new PrintWriter("abc.log");
    for (int i = 0; i < 100 * 1000; i++) {
        pw.println("0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789");
    }
    pw.close();

    long start2 = System.nanoTime();
    int count2 = 0;
    BufferedReader br = new BufferedReader(new FileReader("abc.log"));
    while (br.readLine() != null) count2++;
    br.close();
    long time2 = System.nanoTime() - start2;
    System.out.printf("IO: Took %,d ms to read %,d lines%n", time2 / 1000 / 1000, count2);

    long start = System.nanoTime();
    FileChannel fc = new FileInputStream("abc.log").getChannel();
    ByteBuffer bb = ByteBuffer.allocateDirect((int) fc.size());
    fc.read(bb);
    fc.close();
    bb.flip();

    CharBuffer cb = ByteBuffer.allocateDirect(bb.remaining() * 2).order(ByteOrder.nativeOrder()).asCharBuffer();
    CharsetDecoder cd = Charset.forName("UTF-8").newDecoder();
    cd.decode(bb, cb, true);
    cb.flip();
    StringBuilder sb = new StringBuilder();
    int count = 0;
    while (cb.remaining() > 0) {
        char ch = cb.get();
        if (isEndOfLine(cb, ch)) {
            // process sb
            count++;
            sb.setLength(0);
        } else {
            sb.append(ch);
        }
    }
    long time = System.nanoTime() - start;
    System.out.printf("NIO as UTF-8: Took %,d ms to read %,d lines%n", time / 1000 / 1000, count);

    long start3 = System.nanoTime();
    FileChannel fc2 = new FileInputStream("abc.log").getChannel();
    MappedByteBuffer bb2 = fc2.map(FileChannel.MapMode.READ_ONLY, 0, fc2.size());
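    // A freshly mapped buffer already starts at position 0 with its limit at the file size, so no flip() is needed.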
    StringBuilder sb3 = new StringBuilder();
    int count3 = 0;
    while (bb2.remaining() > 0) {
        // Mask to 0..255 so bytes >= 0x80 map to the correct ISO-8859-1 character.
        char ch = (char) (bb2.get() & 0xFF);
        if (isEndOfLine(bb2, ch)) {
            // process sb
            count3++;
            sb3.setLength(0);
        } else {
            sb3.append(ch);
        }
    }
    fc2.close();
    long time3 = System.nanoTime() - start3;
    System.out.printf("NIO as ISO-8859-1: Took %,d ms to read %,d lines%n", time3 / 1000 / 1000, count3);


}

private static boolean isEndOfLine(CharBuffer cb, char ch) {
    if (ch == '\r') {
        // Treat "\r\n" as a single terminator: consume the '\n' if it follows.
        if (cb.hasRemaining() && cb.get() != '\n') {
            // The char just read belongs to the next line, so un-read it.
            cb.position(cb.position() - 1);
        }
        return true;
    }
    return ch == '\n';
}

private static boolean isEndOfLine(ByteBuffer bb, char ch) {
    if (ch == '\r') {
        // Treat "\r\n" as a single terminator: consume the '\n' if it follows.
        if (bb.hasRemaining() && bb.get() != '\n') {
            // The byte just read belongs to the next line, so un-read it.
            bb.position(bb.position() - 1);
        }
        return true;
    }
    return ch == '\n';
}
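
The memory-mapped variant can also be packaged as a small reusable method. The following is only a sketch under stated assumptions: the file uses a single-byte encoding such as ISO-8859-1 or ASCII, it is smaller than 2 GB (the limit of a single mapping), and only line terminators are counted, as in the benchmark. The names `MappedLineCounter` and `countLines` are hypothetical and not part of the original answer.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MappedLineCounter {
    /**
     * Counts line terminators ("\n", "\r\n" or a lone "\r") in a file
     * by scanning a read-only memory mapping of it.
     * A final line without a terminator is not counted.
     */
    static int countLines(Path file) throws IOException {
        try (FileChannel fc = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = fc.size();
            if (size == 0) {
                return 0; // nothing to map
            }
            MappedByteBuffer bb = fc.map(FileChannel.MapMode.READ_ONLY, 0, size);
            int count = 0;
            boolean lastWasCr = false;
            while (bb.hasRemaining()) {
                byte b = bb.get();
                if (b == '\n') {
                    // The '\n' of a "\r\n" pair was already counted at the '\r'.
                    if (!lastWasCr) {
                        count++;
                    }
                    lastWasCr = false;
                } else if (b == '\r') {
                    count++;
                    lastWasCr = true;
                } else {
                    lastWasCr = false;
                }
            }
            return count;
        }
    }
}
```

Tracking the previous byte in a flag avoids repositioning the buffer when a `\r` is not followed by `\n`.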

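Each line is 102 bytes long, which matches 100 digits plus a two-byte `\r\n` line terminator, so the 100,000-line file is about 10 MB. The program prints: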

IO: Took 112 ms to read 100,000 lines
NIO as UTF-8: Took 207 ms to read 100,000 lines
NIO as ISO-8859-1: Took 87 ms to read 100,000 lines
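
Each figure above comes from a single run, so it includes one-off costs such as class loading and JIT compilation. One way to reduce that noise is to repeat each measurement and keep the fastest run. The helper below is a minimal sketch of that idea; `TimingHarness` and `bestOfNanos` are hypothetical names, and for serious benchmarking a dedicated tool would be a better choice.

```java
import java.util.concurrent.Callable;

final class TimingHarness {
    /**
     * Runs the task the given number of times and returns the
     * shortest elapsed time observed, in nanoseconds.
     */
    static long bestOfNanos(Callable<?> task, int runs) throws Exception {
        if (runs < 1) {
            throw new IllegalArgumentException("runs must be >= 1");
        }
        long best = Long.MAX_VALUE;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.call();
            long elapsed = System.nanoTime() - start;
            best = Math.min(best, elapsed);
        }
        return best;
    }
}
```

Each of the three reading strategies could be wrapped in a `Callable` and passed to this helper, so that early runs serve as warm-up and later runs reflect steady-state speed.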

As noted earlier, saving about 25 ms (112 ms versus 87 ms) is unlikely to be worth the extra complexity of using NIO.
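
For comparison, the simple approach stays short even with an explicit charset and automatic resource cleanup. This is a sketch rather than the code that was benchmarked: it uses `Files.newBufferedReader` instead of `FileReader`, and `SimpleLineCounter` and `countLines` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

final class SimpleLineCounter {
    /**
     * Counts lines with a BufferedReader. Unlike a terminator count,
     * readLine() also returns a final line that has no terminator.
     */
    static int countLines(Path file, Charset charset) throws IOException {
        // try-with-resources closes the reader even if reading fails.
        try (BufferedReader br = Files.newBufferedReader(file, charset)) {
            int count = 0;
            while (br.readLine() != null) {
                count++;
            }
            return count;
        }
    }
}
```

Passing the charset explicitly avoids depending on the platform default encoding.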

By the way: if the file is on a hard disk drive and is not already cached in memory, only the speed of the drive matters.
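
To relate the timings to drive speed, they can be converted into throughput. The snippet below is a small worked calculation, assuming the 102-byte line length and 100,000 lines reported above; `Throughput` and `megabytesPerSecond` are hypothetical names.

```java
final class Throughput {
    /** Converts a byte count and an elapsed time in milliseconds to MB/s (1 MB = 1,000,000 bytes). */
    static double megabytesPerSecond(long bytes, long millis) {
        return (bytes / 1_000_000.0) / (millis / 1000.0);
    }

    public static void main(String[] args) {
        long fileBytes = 102L * 100_000; // 10,200,000 bytes, from the figures in the text
        System.out.printf("IO:               %.1f MB/s%n", megabytesPerSecond(fileBytes, 112));
        System.out.printf("NIO (UTF-8):      %.1f MB/s%n", megabytesPerSecond(fileBytes, 207));
        System.out.printf("NIO (ISO-8859-1): %.1f MB/s%n", megabytesPerSecond(fileBytes, 87));
    }
}
```

With these figures, 10,200,000 bytes in 112 ms is roughly 91 MB/s and in 87 ms roughly 117 MB/s. If the drive cannot deliver data faster than that, the choice of reading strategy stops making a difference.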

Other, 2022/1/1 18:27:58, 442 views
