
Microbenchmarks


Netty has a module called "netty-microbench" which performs a series of microbenchmarks. It is built on top of OpenJDK JMH, the preferred microbenchmarking solution for HotSpot. It comes "batteries included", so you can get started without any additional dependencies.

Running the benchmarks

You can run the benchmarks either through maven from the command line or directly in your IDE. To run all of them with the default settings, use mvn -DskipTests=false test. You need to set skipTests=false explicitly, because we do not want the (potentially time-consuming) microbenchmarks to be executed as unit tests during regular test runs.

If everything works, you will see JMH perform warmup and benchmark iterations for the configured number of forks and present you with a concise summary. Here is what a typical benchmark run looks like (you will see many of these in the output):

# Fork: 2 of 2
# Warmup: 10 iterations, 1 s each
# Measurement: 10 iterations, 1 s each
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
# Running: io.netty.microbench.buffer.ByteBufAllocatorBenchmark.pooledDirectAllocAndFree_1_0
# Warmup Iteration   1: 8454.103 ops/ms
# Warmup Iteration   2: 11551.524 ops/ms
# Warmup Iteration   3: 11677.575 ops/ms
# Warmup Iteration   4: 11404.954 ops/ms
# Warmup Iteration   5: 11553.299 ops/ms
# Warmup Iteration   6: 11514.766 ops/ms
# Warmup Iteration   7: 11661.768 ops/ms
# Warmup Iteration   8: 11667.577 ops/ms
# Warmup Iteration   9: 11551.240 ops/ms
# Warmup Iteration  10: 11692.991 ops/ms
Iteration   1: 11633.877 ops/ms
Iteration   2: 11740.063 ops/ms
Iteration   3: 11751.798 ops/ms
Iteration   4: 11260.071 ops/ms
Iteration   5: 11461.010 ops/ms
Iteration   6: 11642.912 ops/ms
Iteration   7: 11808.595 ops/ms
Iteration   8: 11683.780 ops/ms
Iteration   9: 11750.292 ops/ms
Iteration  10: 11769.986 ops/ms

Result : 11650.238 ±(99.9%) 229.698 ops/ms
  Statistics: (min, avg, max) = (11260.071, 11650.238, 11808.595), stdev = 169.080
  Confidence interval (99.9%): [11420.540, 11879.937]
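The Statistics line can be reproduced from the ten measurement iterations above: the average is the plain arithmetic mean and stdev is the sample standard deviation, from which JMH derives the confidence interval as mean ± t · stdev/√n using the appropriate Student-t quantile. A minimal sketch checking the first two (illustrative only, not JMH's actual code):

```java
public class JmhStats {
    // Measurement iterations from the run above, in ops/ms.
    static final double[] SAMPLES = {
        11633.877, 11740.063, 11751.798, 11260.071, 11461.010,
        11642.912, 11808.595, 11683.780, 11750.292, 11769.986
    };

    // Arithmetic mean of the samples.
    static double mean(double[] xs) {
        double sum = 0;
        for (double x : xs) sum += x;
        return sum / xs.length;
    }

    // Sample standard deviation (divides by n - 1).
    static double stdev(double[] xs) {
        double m = mean(xs), sq = 0;
        for (double x : xs) sq += (x - m) * (x - m);
        return Math.sqrt(sq / (xs.length - 1));
    }

    public static void main(String[] args) {
        // Matches the "Statistics" line: avg = 11650.238, stdev = 169.080.
        System.out.printf("avg = %.3f, stdev = %.3f%n",
                mean(SAMPLES), stdev(SAMPLES));
    }
}
```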

Finally, the test output will look similar to this (depending on your system setup and configuration):

Benchmark                                                                Mode   Samples         Mean   Mean error    Units
i.n.m.b.ByteBufAllocatorBenchmark.pooledDirectAllocAndFree_1_0          thrpt        20    11658.812      120.728   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledDirectAllocAndFree_2_256        thrpt        20    10308.626      147.528   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledDirectAllocAndFree_3_1024       thrpt        20     8855.815       55.933   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledDirectAllocAndFree_4_4096       thrpt        20     5545.538     1279.721   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledDirectAllocAndFree_5_16384      thrpt        20     6741.581       75.975   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledDirectAllocAndFree_6_65536      thrpt        20     7252.869       70.609   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledHeapAllocAndFree_1_0            thrpt        20     9750.225       73.900   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledHeapAllocAndFree_2_256          thrpt        20     9936.639      657.818   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledHeapAllocAndFree_3_1024         thrpt        20     8903.130      197.533   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledHeapAllocAndFree_4_4096         thrpt        20     6664.157       74.163   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledHeapAllocAndFree_5_16384        thrpt        20     6374.924      337.869   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.pooledHeapAllocAndFree_6_65536        thrpt        20     6386.337       44.960   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledDirectAllocAndFree_1_0        thrpt        20     2137.241       30.792   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledDirectAllocAndFree_2_256      thrpt        20     1873.727       41.843   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledDirectAllocAndFree_3_1024     thrpt        20     1902.025       34.473   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledDirectAllocAndFree_4_4096     thrpt        20     1534.347       20.509   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledDirectAllocAndFree_5_16384    thrpt        20      838.804       12.575   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledDirectAllocAndFree_6_65536    thrpt        20      276.976        3.021   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledHeapAllocAndFree_1_0          thrpt        20    35820.568      259.187   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledHeapAllocAndFree_2_256        thrpt        20    19660.951      295.012   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledHeapAllocAndFree_3_1024       thrpt        20     6264.614       77.704   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledHeapAllocAndFree_4_4096       thrpt        20     2921.598       95.492   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledHeapAllocAndFree_5_16384      thrpt        20      991.631       49.220   ops/ms
i.n.m.b.ByteBufAllocatorBenchmark.unpooledHeapAllocAndFree_6_65536      thrpt        20      261.718       11.108   ops/ms
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 993.382 sec - in io.netty.microbench.buffer.ByteBufAllocatorBenchmark

You can also run the benchmarks directly from your IDE. If you have already imported the netty parent project, open the microbench sub-project and navigate to the src/main/java/io/netty/microbench namespace. In the buffer namespace, you can run ByteBufAllocatorBenchmark just like any other JUnit-based test. The main difference is that (as of now) you can only run a benchmark in its entirety, not each sub-benchmark individually. You should see the same output in the console as when running directly through mvn.

Writing a benchmark

Writing a benchmark itself is not hard; writing a good one is. That is not because the microbench project is difficult to use, but because there are common pitfalls you need to avoid when writing benchmarks. Thankfully, the JMH suite provides helpful annotations and features to mitigate most of them. To get started, have your benchmark extend AbstractMicrobenchmark, which makes sure the test is run through JUnit and sets some sensible defaults:

public class MyBenchmark extends AbstractMicrobenchmark {

}

The next step is to create a method annotated with @GenerateMicroBenchmark (and give it a descriptive name):

@GenerateMicroBenchmark
public void measureSomethingHere() {

}

The best thing to do now is to look here for examples and inspiration on how to write proper JMH tests. Also check out this talk by one of the main authors of JMH.
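To illustrate the kind of pitfall JMH protects you from, here is a deliberately naive hand-rolled harness in plain Java (illustrative only, not part of netty-microbench): if the result of the measured code is never consumed, the JIT is free to eliminate the work entirely, which is why JMH insists on returning values from benchmark methods or sinking them into a Blackhole.

```java
public class DeadCodePitfall {
    // The workload under "measurement": sum the integers 0..999,999.
    static long work() {
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        // Pitfall: calling work() without using its result lets the JIT
        // prove the loop has no observable effect and remove it, so the
        // measured time can shrink towards zero and the "benchmark" lies.
        long t0 = System.nanoTime();
        for (int run = 0; run < 1_000; run++) {
            work(); // result discarded -> candidate for dead-code elimination
        }
        long discarded = System.nanoTime() - t0;

        // Consuming the result keeps the work alive (JMH's Blackhole
        // serves the same purpose in a more rigorous way).
        long sink = 0;
        long t1 = System.nanoTime();
        for (int run = 0; run < 1_000; run++) {
            sink += work();
        }
        long consumed = System.nanoTime() - t1;

        System.out.println("discarded result: " + discarded + " ns (untrustworthy)");
        System.out.println("consumed result:  " + consumed + " ns, sink=" + sink);
    }
}
```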

Customizing runtime conditions

The default settings (as described in AbstractMicrobenchmark) are:

  • Warmup iterations: 10
  • Measurement iterations: 10
  • Forks: 2

These settings can be customized at runtime through system properties (warmupIterations, measureIterations, and forks):

mvn -DskipTests=false -DwarmupIterations=2 -DmeasureIterations=3 -Dforks=1 test

Note that using so few iterations is generally not advisable, but it can be helpful for checking that a benchmark works at all before running the full suite later.
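The property lookup itself is plain Java; a minimal sketch of how such defaults-with-override might be read (hypothetical helper, not Netty's actual implementation):

```java
public class BenchmarkSettings {
    // Read an int system property, falling back to a default when the
    // property is unset or unparsable (e.g. -DwarmupIterations=2).
    static int intProperty(String name, int defaultValue) {
        String value = System.getProperty(name);
        if (value == null) {
            return defaultValue;
        }
        try {
            return Integer.parseInt(value.trim());
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }

    public static void main(String[] args) {
        // With no -D flags set, these print the documented defaults.
        System.out.println("warmupIterations  = " + intProperty("warmupIterations", 10));
        System.out.println("measureIterations = " + intProperty("measureIterations", 10));
        System.out.println("forks             = " + intProperty("forks", 2));
    }
}
```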

Note that you can also customize these defaults on a per-test basis through annotations:

@Warmup(iterations = 20)
@Fork(1)
public class MyBenchmark extends AbstractMicrobenchmark {

}

This works on a per-class as well as per-method (benchmark) basis. Note that command line arguments always override the annotation defaults.

Last retrieved on July 19, 2024