
Introduction to the WeChat Official Account Platform

The WeChat Official Account Platform is a basic platform that the WeChat team provides for Official Account developers. By offering a series of APIs to Official Account developers, it lets Official Accounts serve WeChat users better.

A WeChat Official Account is a feature inside the WeChat app. It is the entry point through which an account operator provides information and services to WeChat users. After subscribing to and entering an Official Account, a user can see the account's service menu and pushed messages.

Introduction to WeChat Official Account Development

How to Develop an Official Account

  • Register a developer account on the WeChat Official Account Platform. Official Accounts are divided into service accounts and subscription accounts; different account types have different API permissions on the platform.
  • Configure your own domain and server on the platform so that your server can communicate with the WeChat Official Account Platform.
  • Build your own services on top of the APIs provided by the platform. Services fall into two categories: message sessions inside the Official Account, and web pages inside the Official Account.

Identifying Official Account Users

Every request a WeChat user makes inside an Official Account carries an OpenID parameter that identifies the user. Each user has a unique OpenID for each Official Account. If you need to share users across multiple Official Accounts, bind those accounts and applications to the same WeChat Open Platform account. After binding, a user still has a different OpenID for each account, but only one unique UnionID.

Development Notes

  • Official Account Platform development refers to business development for a WeChat Official Account. For development of mobile apps, PC websites, or third-party Official Account platforms, go to the WeChat Open Platform instead.
  • Before you obtain a verified Official Account, you can use the test account application system to quickly apply for an API test account and start developing against the APIs immediately.
  • During development, you can use the online API debugging tool to debug certain APIs.
  • Every API has a daily call-frequency limit; the exact limits can be found in the Developer Center on the Official Account Platform website.
  • When problems occur, you can locate and resolve them using the return codes of API calls and the alarm troubleshooting guide (API alarms can be configured in the Developer Center on the platform website).
  • The platform uses access_token as the credential for API calls. Every API call requires an access_token, which is valid for 2 hours and must be re-fetched after it expires. The number of times it can be fetched per day is limited, so developers need to store it themselves.
  • Platform API calls only support port 80.

Official Account Services

Message Sessions

  1. Broadcast messages: an Official Account can broadcast messages to its users at a limited frequency (once per day for subscription accounts, 4 times per month for service accounts), including text, rich-media articles, images, video, and voice.
  2. Passive replies: after a user sends a message to the Official Account, the account can automatically reply with a message.
  3. Customer service messages: within 48 hours after a user sends a message to the Official Account, the account can send the user an unlimited number of messages, mainly for customer service scenarios.
  4. Template messages: used to send service notifications to users, such as payment reminders or booking confirmations. The Official Account can proactively send messages based on predefined content templates (see the sketch below).
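As a quick illustration of item 4, here is a minimal sketch of sending a template message with Spring's RestTemplate. It assumes you already hold a valid access_token and an approved template; the template ID and the data field names ("first", "remark") are hypothetical placeholders, not something the platform prescribes.

import org.springframework.web.client.RestTemplate;

import java.util.HashMap;
import java.util.Map;

public class TemplateMessageSender {
    // POST endpoint for template messages; the access_token is passed as a query parameter.
    private static final String SEND_URL =
            "https://api.weixin.qq.com/cgi-bin/message/template/send?access_token=%s";

    public static void send(String accessToken, String openId) {
        Map<String, Object> body = new HashMap<>();
        body.put("touser", openId);                    // the target user's OpenID
        body.put("template_id", "YOUR_TEMPLATE_ID");   // hypothetical: a template approved in the console
        body.put("url", "http://example.com/order/1"); // page opened when the user taps the message
        Map<String, Object> data = new HashMap<>();
        data.put("first", field("Your booking has been confirmed."));
        data.put("remark", field("Thanks for using our service."));
        body.put("data", data);

        String result = new RestTemplate()
                .postForObject(String.format(SEND_URL, accessToken), body, String.class);
        System.out.println(result); // {"errcode":0,"errmsg":"ok",...} on success
    }

    private static Map<String, String> field(String value) {
        Map<String, String> item = new HashMap<>();
        item.put("value", value);
        return item;
    }
}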

Web Pages inside Official Accounts

Many complex business scenarios, such as online shops or utility-bill payment, need to be delivered as web pages. Users reach an Official Account web page by tapping a message the account has sent, or by tapping an item in the account's menu. Web pages rely on two platform capabilities: web page authorization for obtaining basic user information, and the WeChat JS-SDK.

Connecting to the WeChat Official Account Platform

Connecting an Official Account means deploying your Official Account application to your own server, configuring your server information on the WeChat Official Account Platform, and performing token verification. If token verification succeeds, the platform can communicate with your server normally.

The connection process is as follows:

  1. Write the Official Account application (including the token-verification endpoint) and start it on your server.
  2. Fill in the server configuration on the WeChat Official Account Platform website, under "Development - Basic Configuration - Server Configuration - Modify Configuration".
  3. Verify the token. Click the submit button at the bottom of the basic configuration page to run token verification. If the page shows a "token verification failed" message, either the configuration or your server-side code is wrong; if verification succeeds, the page automatically returns to the basic configuration overview.
  4. Enable the configuration and start developing. After the token is verified, click the "Enable" button to enable the configured server, then implement your business logic on top of the platform APIs.

After the account is connected, you can test calls to the WeChat Official Account Platform API. The basic workflow is: write a program, call the API, and inspect the returned result.

Development Basics

Platform API Domains

The WeChat Official Account Platform exposes several domains. Developers can choose the best access point (lower latency, higher stability) according to where their servers are deployed, and keep the other access points for disaster recovery: if a network link fails, switch to a backup access point. The domains are:

  1. General domain (api.weixin.qq.com): routes to the officially designated nearest access point;
  2. General disaster-recovery domain (api2.weixin.qq.com): use this domain when the one above is unreachable;
  3. Shanghai domain (sh.api.weixin.qq.com): routes to the Shanghai access point;
  4. Shenzhen domain (sz.api.weixin.qq.com): routes to the Shenzhen access point;
  5. Hong Kong domain (hk.api.weixin.qq.com): routes to the Hong Kong access point.

Getting an Access Token

Before calling WeChat Official Account Platform APIs, you must obtain an access_token. The access_token is the Official Account's globally unique credential for API calls, and every platform API call requires it as a parameter.

Reserve at least 512 characters of storage for the access_token. It is valid for 2 hours and needs to be refreshed periodically; fetching a new one invalidates the previously issued token.

access_token usage notes:

  • Developers are advised to use a central control server to obtain and refresh the access_token. All other business servers should get the access_token from that central server rather than refreshing it themselves; otherwise tokens can conflict and overwrite each other, breaking the business;
  • The validity period is conveyed by the expires_in field in the response, currently a value within 7200 seconds. The central server should refresh the access_token ahead of that deadline. During the refresh, it can keep handing out the old access_token; the platform guarantees that both the old and new tokens remain usable for 5 minutes, which allows a smooth transition for third-party services;

API request:

HTTP request method: GET
URL: https://api.weixin.qq.com/cgi-bin/token?grant_type=client_credential&appid=APPID&secret=APPSECRET

Response:

A successful response looks like:

{"access_token":"ACCESS_TOKEN","expires_in":7200}

IP Whitelist

The "get access_token" API can only be called successfully from IP addresses that are in the IP whitelist.

Applying for a Test Account

Some advanced APIs of the WeChat Official Account Platform require WeChat verification before they can be called. To help developers quickly learn Official Account development and get familiar with the APIs, you can apply for a test account.

A test account is not the same account as your production Official Account; it has its own configuration, such as appID and appsecret. The test account is essentially a virtual Official Account that lets you call many APIs. Once your business logic works against the test account, you can switch the configuration parameters to those of the production account.

A test account does not require an IP whitelist: calls to the platform APIs using the test account's credentials have no IP restriction.

Link: WeChat Official Account API test account application

Online API Debugging Tool

The online API debugging tool of the WeChat Official Account Platform helps developers check whether the request parameters they send when calling the platform APIs are correct. Fill in the relevant information on the debugging tool page, submit the request, and you get the platform server's response.

The main job of the debugging tool is to help you construct parameters; the platform APIs can of course be called from anywhere else as well.

Link: WeChat Official Account Platform API debugging tool

Web Developer Tool

The platform also provides a "web developer tool" for building WeChat-based web pages or web apps. It is a desktop application that simulates the WeChat client, so developers can conveniently develop and debug on a PC.

Download: web developer tool

Platform API Calls vs. Web Page Requests inside an Official Account

Official Account web pages must be opened inside the WeChat app or in the WeChat web developer tool.

Platform API calls, by contrast, can be made from anywhere that can send HTTP requests and receive the response. However, the "get access_token" API only succeeds when the client IP is in the IP whitelist, and since every other platform API requires the access_token as a parameter, a client that is not in the whitelist effectively cannot call the platform APIs at all. A request from a non-whitelisted client fails like this:

https://api.weixin.qq.com/cgi-bin/token?grant_type=client_credential&appid=xxx&secret=xxx
{"errcode":40164,"errmsg":"invalid ip 58.213.199.30 ipv6 ::ffff:58.213.199.30, not in whitelist hint: [Zhnd8cQNe-4P2DPA]"}

Platform APIs and Features

The WeChat Official Account Platform provides the following APIs and features:

  • Custom menus
  • Message management
  • WeChat web page development
  • Asset (material) management
  • Comment management for rich-media articles
  • User management
  • Account management
  • Data statistics
  • WeChat cards and coupons
  • WeChat stores
  • Intelligent (AI) APIs
  • WeChat hardware features
  • New customer service features
  • Conversation capability (formerly shopping guide assistant)
  • WeChat "one item, one code"
  • WeChat invoices

We can roughly divide the platform APIs into three groups:

  • Basic APIs (menus, users, assets, and so on), which underpin both the message service and the web service.
  • Message service APIs.
  • Web service APIs.

For the concrete usage steps and API call details of the features above, see the WeChat Official Account Development Guide [1].

WeChat Web Service Development

Web Page Authorization

Web page authorization means that when a user visits an Official Account web page and you want to obtain that user's information, the page must first redirect to WeChat's authorization page so the user can grant access.

An Official Account application identifies users by their openId. We use web page authorization to obtain the user's information, determine from the openId which user is visiting the page, and then carry out the corresponding business logic.

The authorization flow is: the Official Account page redirects to the WeChat user-authorization page; WeChat calls back to your server with a code parameter; you exchange the code for a web-authorization access_token (different from the basic API access_token); and you use that access_token to call the WeChat API and fetch the user's basic information.

Implementing web page authorization (see the sketch after this list):

  • Define a Filter on the back end that intercepts every request that requires authorization (a specific URI prefix) and checks whether an openId exists in the request's session. If there is no openId, redirect the unauthorized request to the WeChat authorization page; if the openId exists, do nothing.
  • After WeChat calls back, take the code parameter, call the WeChat API with it to obtain an access_token, then call the WeChat API again with that access_token to obtain the user's basic information. Two WeChat API calls are needed.
  • Save the user's basic information into the session.
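A minimal sketch of the Filter from the first bullet is shown below. The filter name, session attribute key, and callback URL are assumptions for illustration; the callback controller (not shown) would then exchange the code for the web-authorization access_token (GET /sns/oauth2/access_token) and fetch the user's profile (GET /sns/userinfo) before storing the openId in the session.

import javax.servlet.*;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import java.io.IOException;
import java.net.URLEncoder;

public class WxAuthFilter implements Filter {
    private static final String APP_ID = "YOUR_APP_ID";                                // hypothetical placeholder
    private static final String CALLBACK_URL = "http://your.domain/wx/oauth/callback"; // hypothetical placeholder

    @Override
    public void doFilter(ServletRequest req, ServletResponse resp, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        HttpServletResponse response = (HttpServletResponse) resp;
        if (request.getSession().getAttribute("openId") != null) {
            chain.doFilter(req, resp); // already authorized, let the request through
            return;
        }
        // Not authorized yet: redirect to WeChat's user-authorization page.
        String authorizeUrl = "https://open.weixin.qq.com/connect/oauth2/authorize"
                + "?appid=" + APP_ID
                + "&redirect_uri=" + URLEncoder.encode(CALLBACK_URL, "UTF-8")
                + "&response_type=code&scope=snsapi_userinfo&state=STATE#wechat_redirect";
        response.sendRedirect(authorizeUrl);
    }
}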

JS-SDK Usage

A reference page demonstrating JS-SDK features: https://www.weixinsxy.com/jssdk/

The main JS-SDK capabilities are: listening for share events, temporarily storing audio and video, invoking native WeChat app features (such as Scan and the built-in map), WeChat Pay, and so on.

Conditions for accessing a test account's web pages:

  • The page must be opened in the WeChat developer tool or in the WeChat app.
  • Only users who have scanned the QR code and followed the test account can access its pages.
  • The "JS API safe domain" (the domain from which the account calls the JS-SDK) must be configured on the platform; it must be a registered (ICP-filed) domain, not an IP address.
  • The "authorization callback domain" under "API permissions - Web page authorization to obtain basic user information" (the domain WeChat redirects back to after the user authorizes) must also be a registered domain.
  • The "JS API safe domain" and the "authorization callback domain" must be the same.
  • (Note: the URL configured for "token verification" on the platform is used for token verification (checking that message communication works) and for message service development (the message service URL is the same as the token verification URL, but message delivery uses HTTP POST). The token verification URL has nothing to do with web page development; it may be an IP address and may differ from the two domains above.)

Calling WeChat JS-SDK APIs

To call JS-SDK APIs, the developer's application must supply configuration and signature parameters to the JS-SDK config call, chiefly four parameters: appId, timestamp, nonceStr, and signature.

Generating the signature works roughly as follows: use the ordinary access_token to call the WeChat API and obtain a jsapi_ticket, generate a random string (a UUID works), and then apply the signature algorithm to turn the four inputs into a single signature parameter. A sketch is shown below.
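Here is a sketch of the signature step in Java. It assumes the jsapi_ticket has already been fetched (via GET /cgi-bin/ticket/getticket with type=jsapi); the class name and the placeholder values in main() are illustrative only.

import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.UUID;

public class JsapiSignature {

    public static String sign(String jsapiTicket, String nonceStr, long timestamp, String url)
            throws Exception {
        // The four inputs are joined as key=value pairs, with the keys in ASCII order.
        String raw = "jsapi_ticket=" + jsapiTicket
                + "&noncestr=" + nonceStr
                + "&timestamp=" + timestamp
                + "&url=" + url;
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");
        byte[] digest = sha1.digest(raw.getBytes(StandardCharsets.UTF_8));
        return String.format("%040x", new BigInteger(1, digest)); // 40-character lowercase hex
    }

    public static void main(String[] args) throws Exception {
        String nonceStr = UUID.randomUUID().toString().replace("-", "");
        long timestamp = System.currentTimeMillis() / 1000;
        System.out.println(sign("JSAPI_TICKET", nonceStr, timestamp, "https://your.domain/page"));
    }
}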

Using the JS-SDK:

1) Perform permission verification

wx.config({
debug: true, // Enable debug mode: the return values of every API call are alerted on the client; to inspect the passed-in parameters, open the page on a PC, where they are printed to the log.
appId: '${wxJsapiConfig.appId}', // Required, unique identifier of the Official Account
timestamp: '${wxJsapiConfig.timestamp}', // Required, timestamp used to generate the signature
nonceStr: '${wxJsapiConfig.nonceStr}', // Required, random string used to generate the signature
signature: '${wxJsapiConfig.signature}', // Required, the signature
jsApiList: [
'updateAppMessageShareData',
'updateTimelineShareData'
] // Required, list of JS APIs that will be used
});

2) If permission verification succeeds, the JS-SDK runs the wx.ready() callback:

wx.ready(function(){
// Customize the content shared via "Share with friends" and "Share to QQ"
// Runs automatically
wx.updateAppMessageShareData({
title: 'wxPage-index', // Share title
desc: 'Test share to friend.', // Share description
link: '${wxJsapiConfig.url}', // Share link; its domain or path must match the JS API safe domain of the current page's Official Account
imgUrl: '', // Share icon
success: function () {
// Set successfully
console.log("Set share to friend info successfully!");
}
})

// Take a photo or choose images from the phone album
// Runs on click
document.querySelector('#chooseImage').onclick = function () {
wx.chooseImage({
count: 9, // Default is 9
sizeType: ['original', 'compressed'], // Specify original or compressed images; default is both
sourceType: ['album', 'camera'], // Specify the source: album or camera; default is both
success: function (res) {
var localIds = res.localIds; // List of local IDs of the selected images; a localId can be used as the src of an img tag
console.log("chooseImage -- localIds first: " + localIds[0]);
}
});
};
});

3) If permission verification fails, the JS-SDK runs the wx.error() callback.

wx.error(function (res) {
alert(res.errMsg);
});

For more details, see the official documentation.

WeChat Official Account Connection Example

The example below uses Java with the Spring Boot framework.

1. Create and configure the project

Create a Maven project named wechat-official-accounts:

mvn archetype:generate -DgroupId=com.taogen.example -DartifactId=wechat-official-accounts -DarchetypeArtifactId=maven-archetype-webapp -DinteractiveMode=false

Configure the pom.xml file

<project ...>  
...

<properties>
<project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
<project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
<maven.compiler.target>1.8</maven.compiler.target>
<maven.compiler.source>1.8</maven.compiler.source>
<!-- custom properties -->
<project.java.version>1.8</project.java.version>
<junit.version>4.12</junit.version>
<log4j.version>2.8.2</log4j.version>
</properties>

<parent>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-parent</artifactId>
<version>2.1.6.RELEASE</version>
</parent>

<dependencies>
<!-- start -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter</artifactId>
<exclusions>
<exclusion>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-logging</artifactId>
</exclusion>
</exclusions>
</dependency>
<!-- log4j2 -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-log4j2</artifactId>
</dependency>
<!-- spring web -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<!-- unit test -->
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-test</artifactId>
<scope>test</scope>
</dependency>
</dependencies>

<build>
<sourceDirectory>src/main/java</sourceDirectory>
<testSourceDirectory>src/test/java</testSourceDirectory>
<plugins>
<!-- Package as an executable jar/war. $ mvn package spring-boot:repackage -->
<plugin>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-maven-plugin</artifactId>
<executions>
<execution>
<goals>
<goal>repackage</goal>
</goals>
</execution>
</executions>
</plugin>
<!-- maven compile -->
<plugin>
<artifactId>maven-compiler-plugin</artifactId>
<version>3.5.1</version>
<configuration>
<source>${project.java.version}</source>
<target>${project.java.version}</target>
</configuration>
</plugin>
</plugins>
</build>
</project>

Configure log4j2 logging

src/main/resources/log4j2.xml

<Configuration status="WARN">
<Appenders>
<Console name="Console" target="SYSTEM_OUT">
<PatternLayout pattern="%d [%t] %-5level %logger{36} - %msg%n"/>
</Console>
</Appenders>
<Loggers>
<Logger name="com.taogen.example" level="debug" additivity="false">
<AppenderRef ref="Console"/>
</Logger>
<Root level="error">
<AppenderRef ref="Console"/>
</Root>
</Loggers>
</Configuration>

Configure Spring Boot and business parameters

src/main/resources/application.yml

wechat:
  token: "my_token"
  appId: "..."
  appSecret: "..."
server:
  port: 80

2. Write the Java code for token verification

src/main/java/com/taogen/example/App.java

@SpringBootApplication
@RestController
public class App {

private static final Logger logger = LogManager.getLogger();
@Value("${wechat.token}")
private String token;
@Value("${wechat.appId}")
private String appId;
@Value("${wechat.appSecret}")
private String appSecret;

public static void main(String[] args) {
SpringApplication.run(App.class, args);
}

@GetMapping("/hello")
public String hello() {
logger.debug("Access /hello");
return "hello " + new Date();
}

/**
* Validate Token
*
* @param signature WeChat's encrypted signature; it combines the developer's token with the timestamp and nonce request parameters
* @param timestamp timestamp
* @param nonce random number
* @param echostr a random string
* @return if the GET request is confirmed to come from the WeChat server, return the echostr content unchanged
*/
@GetMapping("/wx")
public String validateToken(@RequestParam("signature") String signature,
@RequestParam("timestamp") String timestamp,
@RequestParam("nonce") String nonce,
@RequestParam("echostr") String echostr) {
logger.debug("signature: {}, timestamp: {}, nonce: {}, echostr: {}",
signature, timestamp, nonce, echostr);
// Verify the request by checking the signature (the verification method is below):
// 1) Sort the token, timestamp, and nonce parameters in lexicographic order
// 2) Concatenate the three strings into one string and hash it with SHA-1
// 3) Compare the hash with signature; if they match, the request came from WeChat
String sortedParams = getSortedParams(timestamp, nonce);
String encryptedParams = getEncryptedParams(sortedParams);
if (encryptedParams.equals(signature)) {
return echostr;
} else {
return "error";
}
}

private String getSortedParams(String timestamp, String nonce) {
List<String> params = new ArrayList<>();
params.add(this.token);
params.add(timestamp);
params.add(nonce);
Collections.sort(params, Comparator.naturalOrder());
StringBuilder validateString = new StringBuilder();
for (String param : params) {
validateString.append(param);
}
return validateString.toString();
}

private String getEncryptedParams(String sortedParams) {
MessageDigest crypt = null;
try {
crypt = MessageDigest.getInstance("SHA-1");
crypt.reset();
crypt.update(sortedParams.toString().getBytes("UTF-8"));
} catch (NoSuchAlgorithmException | UnsupportedEncodingException e) {
e.printStackTrace();
}

return new BigInteger(1, crypt.digest()).toString(16);
}
}

3. Package the project, deploy it to the server, and run it

Package the project:

mvn package spring-boot:repackage

Upload the packaged file to the server and start it with:

java -jar <project_name>.jar

4. Verify the token on the WeChat Official Account Platform website

  • Visit the WeChat Official Account Platform and register an account or log in by scanning the QR code.

  • After logging in, choose "Development" - "Basic Configuration" in the left menu to open the basic configuration page.

  • On the basic configuration page, click "Modify Configuration" next to "Server Configuration" and fill in the following, chiefly the URL and the token defined in your project:

    URL: http://<your_server_ip>/wx
    Token: my_token (the token defined in your project)
    EncodingAESKey: (default)
    Message encryption mode: (default)
  • Click the "Submit" button on the modify-configuration page. If token verification succeeds, the site jumps back to the basic configuration page; if it fails, an error message pops up on the current page.

Summary

WeChat Basic APIs

Prerequisites

  • The client's IP must be in the platform's IP whitelist (not required for a test account).

Error when the prerequisite is not met

  • The WeChat API call returns an error like:

    {"errcode":40164,"errmsg":"invalid ip <your_ip> ipv6 ::ffff:<your_ip>, not in whitelist hint: [jhCDbOwFE-C.kTAa] rid: 5f2baef3-06a383d9-48b6ee02"}

Features

  • Get access_token
  • Custom menus (create, delete, query, and so on)
  • Asset management (create, delete, and fetch temporary, permanent, and rich-media assets)
  • Comment management for rich-media articles
  • User management (user tags, remark names, basic information, geographic location, and so on)
  • Account management
  • Data statistics
  • WeChat cards and coupons

How to integrate

  • Obtain an access_token and call the WeChat APIs.

WeChat Message Service

Prerequisites

  • Token verification has passed.

Error when the prerequisite is not met

  • Your server receives no requests from the WeChat server.

Features

  • Custom menus - event pushes
  • Message management (receiving ordinary messages, receiving event pushes, replying to messages)
  • Message management (broadcast messages, template messages, one-time subscription messages)

How to integrate

  • Receive the token verification request from the WeChat server and verify the token.
  • Receive messages from the WeChat server, parse the XML to get the message content and user information, and reply with an XML-formatted message (see the sketch after this list).
  • Obtain an access_token and call the WeChat APIs to broadcast messages to multiple users.
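As referenced in the second bullet, here is a minimal sketch of such a message endpoint. It shares the /wx path with the GET token-verification handler shown earlier, simply echoes text messages back to the sender, and omits signature checking and non-text message types for brevity; the controller name is an assumption.

import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RequestBody;
import org.springframework.web.bind.annotation.RestController;
import org.w3c.dom.Document;

import javax.xml.parsers.DocumentBuilderFactory;
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;

@RestController
public class WxMessageController {

    @PostMapping(value = "/wx", produces = "application/xml;charset=UTF-8")
    public String onMessage(@RequestBody String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        String fromUser = doc.getElementsByTagName("FromUserName").item(0).getTextContent();
        String toUser = doc.getElementsByTagName("ToUserName").item(0).getTextContent();
        String msgType = doc.getElementsByTagName("MsgType").item(0).getTextContent();
        String content = "text".equals(msgType)
                ? doc.getElementsByTagName("Content").item(0).getTextContent()
                : "(unsupported message type: " + msgType + ")";

        // Reply with a text message; ToUserName and FromUserName are swapped relative to the request.
        return "<xml>"
                + "<ToUserName><![CDATA[" + fromUser + "]]></ToUserName>"
                + "<FromUserName><![CDATA[" + toUser + "]]></FromUserName>"
                + "<CreateTime>" + (System.currentTimeMillis() / 1000) + "</CreateTime>"
                + "<MsgType><![CDATA[text]]></MsgType>"
                + "<Content><![CDATA[You said: " + content + "]]></Content>"
                + "</xml>";
    }
}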

WeChat Web Service

Prerequisites

  • Official Account web pages must be opened in the WeChat developer tool or in the WeChat app.
  • The domain of the page URL, the JS API safe domain, and the authorization callback domain must be the same, and the domain must be registered (ICP-filed).

Error when the prerequisites are not met

  • The page shows the following message:

    Oops! Something went wrong:(

Features

  • Web page authorization
  • Web page styling with WeUI
  • JS-SDK (basic, sharing, image, audio, AI, and geolocation APIs)
  • JS-SDK (WeChat Pay)
  • WeChat open tags

How to integrate

  • Obtain the WeChat user's information in the page: the Official Account page redirects to the WeChat authorization page, WeChat calls back to your server with a code parameter, you exchange the code for a web-authorization access_token (different from the basic API access_token), and you call the WeChat API with it to fetch the user's basic information.
  • Call the WeChat JS-SDK APIs in the page: call the WeChat API to get a jsapi_ticket, build the signature from the jsapi_ticket, return the signature and related parameters to the front end, and let the front end call the JS-SDK APIs with them.

FAQ

Question: an error occurs when the Official Account page redirects to the WeChat authorization page (to get the code).

The WeChat authorization page shows: Oops! Something went wrong:(

Solution:

Make sure the Official Account page link is opened in the WeChat developer tool or the WeChat app, not in an ordinary browser.

Check the configuration. In the platform's API permission list, edit the "authorization callback domain" on the "Web page authorization to obtain basic user information" row (do not include http:// or https://).

Question: JSAPI parameter error, invalid signature.

The error message is: {errMsg: "config:fail,Error: 系统错误,错误码:63002,invalid signature [20200804 16:04:22][]"}

Solution:

Check that the four required inputs (noncestr, jsapi_ticket, timestamp, url) are non-empty and correct.

Check that the signature algorithm is correct.

Check that the URL of the page being visited matches the URL passed into the signature. Note that if the signed url is https://xxx, the page must also be opened as https://xxx, not http://xxx.

Question: JS-SDK debug mode shows no alert.

Solution:

Check that the JS-SDK script is referenced correctly, e.g. <script src="https://res.wx.qq.com/open/js/jweixin-1.6.0.js"></script>

Check the console of the WeChat developer tool for JavaScript syntax errors.

Appendixes

Developer specification

API permission description

Global return codes

References

[1] WeChat Official Account Development Guide - DOC

[2] WeChat Official Account Platform

In this post we are going to talk about NIO and NIO.2 in Java. NIO was introduced in JDK 1.4 but was missing several features, which were subsequently provided by NIO.2 in JDK 7.

Introduction

Input and output (I/O) facilities are fundamental parts of operating systems along with computer languages and their libraries.

Java's initial suite of I/O APIs and related architecture are known as classic I/O. Because modern operating systems feature newer I/O paradigms that classic I/O doesn't support, new I/O (NIO) was introduced as part of JDK 1.4 to support them. Because of a lack of time, some other NIO features were deferred to JDK 5 and JDK 7.

Classic I/O

JDK 1.0 introduced I/O facilities for accessing the file system, accessing file content randomly, and streaming byte-oriented data between sources and destinations in a sequential manner.

NIO

Modern operating systems offer sophisticated I/O services (such as readiness selection) for improving I/O performance and simplifying I/O. Java Specification Request (JSR) 51 was created to address these capabilities.

JSR 51’s description indicates that it provides APIs for scalable I/O, fast buffered binary and character I/O, regular expressions, and charset conversion. Collectively, these APIs are known as NIO. JDK 1.4 implemented NIO in terms of the following APIs:

  • Buffers
  • Channels
  • Selectors
  • Regular expressions
  • Charsets

Buffers

Buffers are the foundation for NIO operations. Essentially, NIO is all about moving data into and out of buffers.

A classic read operation proceeds roughly as follows. The operating system issues a command to the disk controller to read a block of bytes from a disk into an operating system buffer. Once this operation completes, the operating system copies the buffer contents to the buffer specified by the process when it issued the read() operation.

There are some inefficiencies in classic I/O. Copying bytes from the operating system buffer to the process buffer isn't very efficient. It would be more performant to have the DMA controller copy directly to the process buffer, but there are two problems with this approach:

  • The DMA controller typically cannot communicate directly with the user space in which the JVM process runs.
  • Block-oriented devices such as a DMA controller work with fixed-size data blocks.

Because of these problems, the operating system acts as an intermediary, tearing apart and recombining data as it switches between the JVM process and the DMA controller.

The data assembly/disassembly tasks can be made more efficient by letting the JVM process pass a list of buffer addresses to the operating system in a single call. The operating system then fills or drains these buffers in sequence, scattering data to multiple buffers during a read operation or gathering data from several buffers during a write operation. This scatter/gather activity reduces the number of system calls.

JDK 1.4’s java.nio.Buffer class abstracts the concept of a JVM process buffer.

Channels

Forcing a CPU to perform I/O tasks and wait for I/O completions is wasteful of this resource. Performance can be improved by offloading these tasks to DMA controllers so that the processor can get on with other work.

A channel serves as a conduit for communicating (via the operating system) with a DMA controller to efficiently drain byte buffers to, or fill byte buffers from, a disk. JDK 1.4's java.nio.channels.Channel interface, its subinterfaces, and various classes implement the channel architecture.

Selectors

I/O is classified as block-oriented or stream-oriented. Reading from or writing to a file is an example of block-oriented I/O. In contrast, reading from the keyboard or writing to a network connection is an example of stream-oriented I/O.

Stream I/O is often slower than block I/O. Many operating systems allow streams to be configured to operate in nonblocking mode, in which a thread continually checks available input without blocking when no input is available. The thread can handle incoming data or perform other tasks until data arrives.

This "polling for available input" activity can be wasteful, especially when the thread needs to monitor many input streams. Modern operating systems can perform this checking efficiently, which is known as readiness selection, and which is often built on top of nonblocking mode. The operating system monitors a collection of streams and returns an indication to the thread of which streams are ready to perform I/O. As a result, a single thread can multiplex many active streams with common code.

JDK 1.4 supports readiness selection by providing selectors, which are instances of the java.nio.channels.Selector class that can examine one or more channels and determine which channels are ready for reading or writing.

NIO.2

JSR 51 specified that NIO would introduce an improved file system interface and support asynchronous I/O and complete socket channel functionality. However, lack of time prevented these features from being included. JSR 203 was subsequently created to address them, and its features were introduced in JDK 7.

Improved File System Interface

The legacy File class suffers from various problems. For example, the renameTo() method doesn’t work consistently across operating systems. The new file system interface fixes these and other problems.

Asynchronous I/O

Nonblocking mode improves performance by preventing a thread that performs a read or write operation on a channel from blocking until input is available or output has been fully written. However, it doesn't let an application determine if it can perform an operation without actually performing the operation, and it can't separate the code that checks for stream readiness from the data-processing code without making your code significantly more complicated.

Asynchronous I/O overcomes this problem by letting the thread initiate the operation and immediately proceed to other work. The thread specifies some kind of callback function that is invoked when the operation finishes.

Completion of Socket Channel Functionality

JDK 1.4 added the DatagramChannel, ServerSocketChannel, and SocketChannel classes to the java.nio.channels package. However, lack of time prevented these classes from supporting binding and option configuration. Also, channel-based multicast datagrams were not supported. JDK 7 added binding support and option configuration to the aforementioned classes, and introduced a new java.nio.channels.MulticastChannel interface.

Buffers

NIO is based on buffers, whose contents are sent to or received from I/O services via channels. This section introduces NIO's buffer classes.

Buffer Classes

A buffer is an object that stores a fixed amount of data to be sent to or received from an I/O service. It sits between an application and a channel that writes the buffered data to the service or reads the data from the service and deposits it into the buffer.

Buffers are not safe for use by multiple concurrent threads. If a buffer is to be used by more than one thread then access to the buffer should be controlled by appropriate synchronization.

Buffers have four basic properties:

  • Capacity. It is the number of elements it contains.
  • Limit. It is the index of the first element that should not be read or written.
  • Position. It is the index of the next element to be read or written. A buffer’s position is never negative and is never greater than its limit.
  • Mark. It’s the index to which its position will be reset when the reset method is invoked.
0 <= mark <= position <= limit <= capacity
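To make these properties concrete, the following short snippet (expected values shown in the comments) prints how position and limit change as a buffer is filled and flipped:

import java.nio.ByteBuffer;

public class BufferProperties {
    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocate(8);       // capacity=8, position=0, limit=8
        buffer.put((byte) 1).put((byte) 2).put((byte) 3); // position advances to 3
        System.out.printf("capacity=%d position=%d limit=%d%n",
                buffer.capacity(), buffer.position(), buffer.limit()); // 8, 3, 8
        buffer.flip(); // limit=3, position=0: the buffer is now ready for reading
        System.out.printf("capacity=%d position=%d limit=%d%n",
                buffer.capacity(), buffer.position(), buffer.limit()); // 8, 0, 3
    }
}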

The Hierarchy of Buffer Classes

A-java.nio.Buffer
|----A-ByteBuffer
|----A-CharBuffer
|----A-DoubleBuffer
|----A-FloatBuffer
|----A-IntBuffer
|----A-LongBuffer
|----A-ShortBuffer

Methods of Buffer

  • Abstract methods
    • abstract Object array()
    • abstract int arrayOffset()
    • abstract boolean hasArray()
    • abstract boolean isDirect()
    • abstract boolean isReadOnly()
  • Get Buffer’s Property
    • int capacity()
    • int limit()
    • int position()
    • boolean hasRemaining(). Tells whether there are any elements between the current position and the limit.
    • int remaining(). Returns the number of elements between the current position and the limit.
  • Set Buffer’s Property
    • Buffer limit(int newLimit). Sets this buffer’s limit.
    • Buffer position(int newPosition). Sets this buffer’s position.
    • Buffer mark(). Sets this buffer’s mark at its position.
    • Buffer reset(). Resets this buffer’s position to the previously-marked position.
  • Clearing, flipping, and rewinding
    • Buffer clear(). Clears this buffer. It makes a buffer ready for a new sequence of channel-read or relative put operations. The position is set to zero, the limit is set to the capacity, and the mark is discarded. Invoke this method before using a sequence of channel-read or put operations to fill this buffer. For example, buf.clear(); in.read(buf);
    • Buffer flip(). Flips this buffer. It makes a buffer ready for a new sequence of channel-write or relative get operations. The limit is set to the current position, the position is set to zero. If the mark is defined then it is discarded. After a sequence of channel-read or put operations, invoke this method to prepare for a sequence of channel-write or relative get operations. For example, buf.put(magic); in.read(buf); buf.flip(); out.write(buf);. This method is often used in conjunction with the compact method when transferring data from one place to another.
    • Buffer rewind(). Rewinds this buffer. It makes a buffer ready for re-reading the data that it already contains. The position is set to zero and the mark is discarded. Invoke this method before a sequence of channel-write or get operations, assuming that the limit has already been set appropriately. For example, out.write(buf); buf.rewind(); buf.get(array);

Buffers in Depth

Buffer Creation

ByteBuffer and the other primitive-type buffer classes declare various class methods for creating a buffer of that type.

Methods for creating ByteBuffer instances:

  • ByteBuffer allocate(int capacity)
  • ByteBuffer allocateDirect(int capacity)
  • ByteBuffer wrap(byte[] array)
  • ByteBuffer wrap(byte[] array, int offset, int length)

These methods show two ways to create a byte buffer: create the ByteBuffer object and allocate an internal array that stores capacity bytes or create the ByteBuffer object and use the specified array to store these bytes. For example:

ByteBuffer buffer = ByteBuffer.allocate(10);
byte[] bytes = new byte[200];
ByteBuffer buffer2 = ByteBuffer.wrap(bytes);

View Buffers

Buffers can manage data stored in other buffers. A view buffer is a buffer that manages another buffer's data: the view shares the same internal array as the original buffer, but each buffer has its own position, limit, and mark.

ByteBuffer buffer = ByteBuffer.allocate(10);
ByteBuffer bufferView = buffer.duplicate();

Read-only view buffers

ByteBuffer buffer = ByteBuffer.allocate(10);
ByteBuffer bufferView = buffer.asReadOnlyBuffer();

Buffer Writing and Reading

Methods for ByteBuffer’s writing and reading

  • ByteBuffer put(byte b)
  • ByteBuffer put(int index, byte b)
  • byte get()
  • byte get(int index)

Flipping Buffers

buffer.flip() is equivalent to buffer.limit(buffer.position()).position(0). An example of flipping a buffer:

ByteBuffer buffer = ByteBuffer.allocate(10);
buffer.put(new byte[]{1, 2, 3});
buffer.flip();
while (buffer.hasRemaining()){
System.out.print(buffer.get());
}
buffer.clear();

Compact Buffers

The compact() method moves unwritten buffer data to the beginning of the buffer so that the next read() method call appends read data to the buffer’s data instead of overwriting that data. Do this in case of a partial write.

buf.clear();
while (in.read(buf) != -1){
buf.flip();
out.write(buf);
buf.compact();
}

Byte Ordering

Nonbyte primitive types except for boolean are composed of several bytes. Each value of one of these multibyte types is stored in a sequence of contiguous memory locations. However, the order of these bytes can differ from operating system to operating system.

For example, consider the 32-bit integer value 0x10203040. This value's four bytes could be stored in memory (from low address to high address) as 10, 20, 30, 40; this arrangement is known as big endian order (the most-significant byte, the "big" end, is stored at the lowest address). Alternatively, these bytes could be stored as 40, 30, 20, 10; this arrangement is known as little endian order (the least-significant byte, the "little" end, is stored at the lowest address).

Java provides the java.nio.ByteOrder class to help you deal with byte-order issues when writing/reading multibyte value to/from a multibyte buffer. ByteOrder declares a ByteOrder nativeOrder() method that returns the operating system’s byte order as a ByteOrder instance. This instance is one of ByteOrder’s BIG_ENDIAN and LITTLE_ENDIAN constants.

  • static ByteOrder BIG_ENDIAN
  • static ByteOrder LITTLE_ENDIAN
  • ByteOrder nativeOrder()

A ByteBuffer's default byte order is always big endian, even when the underlying operating system's byte order is little endian. Java's default byte order is also big endian, which lets class files and serialized objects store data consistently across Java virtual machines.

Because the big endian default can impact performance on little endian operating systems, ByteBuffer also declares a ByteBuffer order(ByteOrder bo) method to change the byte buffer's byte order.

Although it may seem unusual to change the byte order of a byte buffer, this method is useful because ByteBuffer also declares several convenience methods for writing and reading multibyte values, such as ByteBuffer putInt(int value) and int getInt(). These convenience methods write these values according to the byte buffer’s current byte order.
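As a small illustration, the following snippet writes the same int value with both byte orders and prints the raw bytes:

import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ByteOrderDemo {
    public static void main(String[] args) {
        System.out.println("Native order: " + ByteOrder.nativeOrder());

        ByteBuffer big = ByteBuffer.allocate(4);          // default order: BIG_ENDIAN
        big.putInt(0x10203040);
        ByteBuffer little = ByteBuffer.allocate(4).order(ByteOrder.LITTLE_ENDIAN);
        little.putInt(0x10203040);

        // Prints 10 20 30 40 for the big endian buffer and 40 30 20 10 for the little endian one.
        for (byte b : big.array()) System.out.printf("%02x ", b);
        System.out.println();
        for (byte b : little.array()) System.out.printf("%02x ", b);
        System.out.println();
    }
}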

Direct Byte Buffers

Operating systems can directly access the address space of a process. For example, an operating system could directly access a JVM process's address space to perform a data transfer operation based on a byte array. However, a JVM might not store the array of bytes contiguously, or its garbage collector might move the byte array to another location. Because of these limitations, direct byte buffers were created.

A direct byte buffer is a byte buffer that interacts with channels and native code to perform I/O. The direct byte buffer attempts to store byte elements in a memory area that a channel uses to perform direct (raw) access via native code that tells the operating system to drain or fill the memory area directly.

Direct byte buffers are the most efficient means of performing I/O on the JVM. Although you can also pass non-direct byte buffers to channels, a performance problem might arise because non-direct byte buffers are not always able to serve as the target of native I/O operations.

Although direct byte buffers are optimal for I/O, a direct byte buffer can be expensive to create because memory outside the JVM's heap has to be allocated by the operating system, and setting up/tearing down this memory might take longer than when the buffer is located within the heap.

After your code is working, if you want to experiment with performance optimization, you can easily obtain a direct byte buffer by invoking ByteBuffer's allocateDirect() method.

Channels

Channels partner with buffers to achieve high-performance I/O.

A Channel is an object that represents an open connection to a hardware device, a file, a network socket, an application component, or another entity that’s capable of performing writes, reads, and other I/O operations. Channels efficiently transfer data between byte buffers and operating system-based I/O service sources or destinations.

Channel classes

The Hierarchy of Channel Classes

I-Channel
|----I-ReadableByteChannel
|--------I-ScatteringByteChannel
|----I-WritableByteChannel
|--------I-GatheringByteChannel
|--------I-ByteChannel
|------------I-SeekableByteChannel
|----I-InterruptibleChannel
|--------A-AbstractInterruptibleChannel
|------------A-FileChannel
|------------A-SelectableChannel
|----------------A-AbstractSelectableChannel
|--------------------A-DatagramChannel
|--------------------A-ServerSocketChannel
|--------------------A-SocketChannel
|--------------------A-Pipe.SinkChannel
|--------------------A-Pipe.SourceChannel
|----I-AsynchronousChannel
|--------I-AsynchronousByteChannel
|--------A-AsynchronousFileChannel
|--------A-AsynchronousServerSocketChannel
|--------A-AsynchronousSocketChannel
|----I-NetworkChannel
|--------I-MulticastChannel

All channels are instances of classes that ultimately implement the java.nio.channels.Channel interface. Channel declares the following methods:

  • void close()
  • boolean isOpen()

To support I/O, Channel is extended by the WritableByteChannel and ReadableByteChannel interfaces.

  • WritableByteChannel declares an abstract int write(ByteBuffer buffer) method that writes a sequence of bytes from buffer to the current channel.
  • ReadableByteChannel declares an abstract int read(ByteBuffer buffer) method that reads bytes from the current channel into buffer.

A channel whose class implements only WritableByteChannel or ReadableByteChannel is unidirectional.

The InterruptibleChannel interface describes a channel that can be asynchronously closed and interrupted.

NIO’s designers chose to shut down a channel when a blocked thread is interrupted because they couldn’t find a way to reliably handle interrupted I/O operations in the same manner across operating systems. The only way to guarantee deterministic behavior was to shut down the channel.

Obtain a channel

There are two ways to obtain a channel:

  • The java.nio.channels package provides a Channels utility class that offers two methods for obtaining channels from streams.
    • WritableByteChannel newChannel(OutputStream outputStream)
    • ReadableByteChannel newChannel(InputStream inputStream)
  • Various classic I/O classes have been retrofitted to support channel creation.
    • java.io.RandomAccessFile‘s FileChannel getChannel() method.
    • java.net.Socket‘s SocketChannel getChannel() method.

For example, obtain channels from standard I/O streams:

ReadableByteChannel src = Channels.newChannel(System.in);
WritableByteChannel dest = Channels.newChannel(System.out);
ByteBuffer buffer = ByteBuffer.allocateDirect(2048);
while (src.read(buffer) != -1){
buffer.flip();
dest.write(buffer);
buffer.compact();
}
buffer.flip();
while (buffer.hasRemaining()){
dest.write(buffer);
}

Channels in Depth

Scatter/Gather I/O

Channels provide the ability to perform a single I/O operation across multiple buffers. This capability is known as scatter/gather I/O (and is also known as vectored I/O).

In the context of a write operation, the contents of several buffers are gathered in sequence and then sent through the channel to a destination. In the context of a read operation, the contents of a channel are scattered to multiple buffers in sequence.

Modern operating systems provide APIs that support vectored I/O to eliminate (or at least reduce) system calls or buffer copies, and hence improve performance.

Java Provides the java.nio.channels.ScatteringByteChannel interface to support scattering and GatheringByteChannel interface to support gathering.

ScatteringByteChannel offers the following methods:

  • long read(ByteBuffer[] buffers, int offset, int length)
  • long read(ByteBuffer[] buffers)

GatheringByteChannel offers the following methods:

  • long write(ByteBuffer[] buffers, int offset, int length)
  • long write(ByteBuffer[] buffers)
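The following short example gathers a write from two buffers into a file and then scatters the read back into two buffers (the file name is arbitrary):

import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

public class ScatterGatherDemo {
    public static void main(String[] args) throws IOException {
        ByteBuffer header = ByteBuffer.wrap("HEADER".getBytes("US-ASCII"));
        ByteBuffer body = ByteBuffer.wrap("body-data".getBytes("US-ASCII"));

        // Gathering write: the contents of both buffers are written to the channel in sequence.
        try (FileChannel out = new RandomAccessFile("scatter.dat", "rw").getChannel()) {
            out.write(new ByteBuffer[]{header, body});
        }

        // Scattering read: the channel fills the first buffer, then the second.
        ByteBuffer headerIn = ByteBuffer.allocate(6);
        ByteBuffer bodyIn = ByteBuffer.allocate(9);
        try (FileChannel in = new RandomAccessFile("scatter.dat", "r").getChannel()) {
            in.read(new ByteBuffer[]{headerIn, bodyIn});
        }
        headerIn.flip();
        bodyIn.flip();
        System.out.println(new String(headerIn.array(), 0, headerIn.limit(), "US-ASCII"));
        System.out.println(new String(bodyIn.array(), 0, bodyIn.limit(), "US-ASCII"));
    }
}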

File Channels

RandomAccessFile, FileInputStream, and FileOutputStream provide a getChannel() method that returns a file channel instance, which describes an open connection to a file.

The abstract java.nio.channels.FileChannel class describes a file channel. This class implements the InterruptibleChannel, ByteChannel, GatheringByteChannel, and ScatteringByteChannel interfaces.

Unlike buffers, which are not thread-safe, file channels are thread-safe.

A file channel maintains a current position into the file, which FileChannel lets you obtain and change.

Methods of FileChannel:

  • void force(boolean metadata)
  • long position()
  • FileChannel position(long newPosition)
  • int read(ByteBuffer buffer)
  • int read(ByteBuffer dst, long position)
  • long size()
  • FileChannel truncate(long size)
  • int write(ByteBuffer buffer)
  • int write(ByteBuffer src, long position)

FileChannel objects support the concept of a current file position, which determines the location where the next data item will be read from or written to.

The following code is a FileChannel usage example:

RandomAccessFile raf = new RandomAccessFile("temp.txt", "rw");
FileChannel fc = raf.getChannel();
long pos = fc.position();
System.out.println("Position: " + pos);
System.out.println("Size: " + fc.size());
String msg = "This is a test message.";
ByteBuffer buffer = ByteBuffer.allocateDirect(msg.length() * 2);
buffer.asCharBuffer().put(msg);
fc.write(buffer);
fc.force(true);
System.out.println("Position: " + fc.position());
System.out.println("Size: " + fc.size());
buffer.clear();
fc.position(pos);
fc.read(buffer);
buffer.flip();
while (buffer.hasRemaining()){
System.out.print(buffer.getChar());
}

Locking Files

The ability to lock all or part of a file was an important feature missing from Java until Java 1.4 arrived. This capability lets a JVM process prevent other processes from accessing all or part of a file until it's finished with the entire file or the relevant part of it.

Although an entire file can be locked, it’s often desirable to lock a smaller region. For example, a database management system might lock individual table rows.

Locks that are associated with files are known as file locks. Each file lock starts at a certain byte position in the file and has a specific length (in bytes) from this position.

There are two kinds of file locks: exclusive and shared.

Some important points about file locking:

  • When an operating system doesn't support shared locks, a shared lock request is quietly promoted to a request for an exclusive lock.
  • Locks are applied on a per-file basis. They are not applied on a per-thread or per-channel basis.

FileChannel declares four methods for obtaining exclusive and shared locks:

  • FileLock lock()
  • FileLock lock(long position, long size, boolean shared)
  • FileLock tryLock()
  • FileLock tryLock(long position, long size, boolean shared)

A FileLock instance is associated with a FileChannel instance but the file lock represented by the FileLock instance associates with the underlying file and not with the file channel. Without care, you can run into conflicts (and possibly even a deadlock) when you don’t release a file lock after you’re finished using it. The following code shows the FileLock usage:

FileLock lock = fileChannel.lock(); // fileChannel is an open FileChannel instance
try {
// interact with the file channel
} catch (IOException ioe){
// handle the exception
} finally {
lock.release();
}

Mapping Files into Memory

FileChannel declares a map() method that lets you create a virtual memory mapping between a region of an open file and a java.nio.MappedByteBuffer instance that wraps itself around this region. This mapping mechanism offers an efficient way to access a file because no time-consuming system calls are needed to perform I/O. The map method is:

MappedByteBuffer map(FileChannel.MapMode mode, long position, long size)

FileChannel.MapMode enumerated type:

  • READ_ONLY
  • READ_WRITE
  • PRIVATE

Changes made to the resulting buffer will eventually be propagated to the file. They might not be made visible to other programs that have mapped the same file.

The specified mapping mode is constrained by the invoking FileChannel object’s access permissions. For example, if the file channel was opened as a read-only channel, and if you request READ_WRITE mode, map() will throw NonWritableChannelException.

Invoke MappedByteBuffer‘s isReadOnly() method to determine whether or not you can modify the mapped file.

The position and size parameters define the start and extent of the mapped region.

There is no unmap() method. Once a mapping is established, it remains until the MappedByteBuffer object is garbage collected (or the application exits).

The following code shows a MappedByteBuffer example:

RandomAccessFile raf = new RandomAccessFile("temp.txt", "rw");
FileChannel fc = raf.getChannel();
long size = fc.size();
System.out.println("Size: " + size);
MappedByteBuffer mbb = fc.map(FileChannel.MapMode.READ_WRITE, 0,
size);
mbb.clear();
while (mbb.remaining() > 0) {
System.out.print((char) mbb.get());
}
System.out.println();
System.out.println();

String msg = "hello";
mbb.clear();
mbb.asCharBuffer().put(msg);
while (mbb.hasRemaining()) {
fc.write(mbb);
}
fc.close();

Transferring Bytes Among Channels

To optimize the common practice of performing bulk transfers, two methods have been added to FileChannel that avoid the need for intermediate buffers:

  • long transferFrom(ReadableByteChannel src, long position, long count)
  • long transferTo(long position, long count, WritableByteChannel target)

Here is an example of transferring data between channels:

try (FileInputStream fis = new FileInputStream("temp.txt")){
FileChannel inChannel = fis.getChannel();
WritableByteChannel outChannel = Channels.newChannel(System.out);
inChannel.transferTo(0, inChannel.size(), outChannel);
} catch (IOException ioe){
System.out.println("I/O error: " + ioe.getMessage());
}

Socket Channels

Socket declares a SocketChannel getChannel() method for returning a socket channel instance, which describes an open connection to a socket. Unlike sockets, socket channels are selectable and can function in nonblocking mode. These capabilities enhance the scalability and flexibility of large applications.

Socket channels are described by the java.nio.channels package's abstract ServerSocketChannel, SocketChannel, and DatagramChannel classes. Each class ultimately extends SelectableChannel and InterruptibleChannel, making socket channels selectable and interruptible.

SelectableChannel offers the following methods for controlling blocking and nonblocking mode:

  • SelectableChannel configureBlocking(boolean block)
  • boolean isBlocking()
  • Object blockingLock()

To enable nonblocking sockets:

ServerSocketChannel ssc = ServerSocketChannel.open();
ssc.configureBlocking(false);

The blockingLock() method lets you prevent other threads from changing a socket channel’s blocking/nonblocking status.

ServerSocketChannel ssc = ServerSocketChannel.open();
SocketChannel sc = null;
Object lock = ssc.blockingLock();
synchronized(lock){
boolean blocking = ssc.isBlocking();
ssc.configureBlocking(false);
sc = ssc.accept();
ssc.configureBlocking(blocking);
}

Methods of ServerSocketChannel:

  • static ServerSocketChannel open(). Attempts to open a server-socket channel, which is initially unbound; it must be bound to a specific address via one of its peer socket's bind() methods before connections can be accepted.
  • ServerSocket socket()
  • SocketChannel accept(). Accepts a connection made to this channel's socket. If this channel is nonblocking, it immediately returns null when there are no pending connections, or returns a socket channel that represents the connection. Otherwise, when the channel is blocking, it blocks.

An example of ServerSocketChannel is shown below:

ServerSocketChannel ssc = ServerSocketChannel.open();
ssc.socket().bind(new InetSocketAddress(9999));
ssc.configureBlocking(false);
String msg = "Local address: " + ssc.socket().getLocalSocketAddress();
ByteBuffer buffer = ByteBuffer.wrap(msg.getBytes());
while (true){
System.out.print(".");
SocketChannel sc = ssc.accept();
if (sc != null){
System.out.println();
System.out.println("Received connection from " +
sc.socket().getRemoteSocketAddress());
buffer.rewind();
sc.write(buffer);
sc.close();
} else {
try{
Thread.sleep(100);
}catch (InterruptedException ie){
assert false; // shouldn't happen
}
}
}

Methods of SocketChannel:

  • static SocketChannel open()
  • static SocketChannel open(InetSocketAddress remoteAddr)
  • Socket socket()
  • boolean connect(SocketAddress remoteAddr)
  • boolean isConnectionPending()
  • boolean finishConnect()
  • boolean isConnected()

SocketChannel sc = SocketChannel.open();
sc.connect(new InetSocketAddress("localhost", 9999));
while (!sc.finishConnect()){
System.out.println("waiting to finish connection");
}
ByteBuffer buffer = ByteBuffer.allocate(200);
buffer.asCharBuffer().put("hello at " + new Date());
// send
sc.write(buffer);
// receive
while (sc.read(buffer) >= 0){
buffer.flip();
while (buffer.hasRemaining()){
System.out.print((char) buffer.get());
}
buffer.clear();
}
sc.close();

Pipes

Pipe describes a pair of channels that implement a unidirectional pipe, which is a conduit for passing data in one direction between two entities, such as two file channels or two socket channels. Pipe is analogous to java.io.PipedInputStream and PipedOutputStream.

Pipe declares nested SourceChannel and SinkChannel classes that serve as readable and writable byte channels, respectively. Pipe also declares the following methods:

  • static Pipe open()
  • SourceChannel source()
  • SinkChannel sink()

Pipe can be used to pass data within the same JVM. Pipes are ideal in producer/consumer scenarios because of encapsulation: you can use the same code to write data to files, sockets, or pipes depending on the kind of channel presented to the pipe.

The following code shows a producer/consumer example:

final int BUFSIZE = 10;
final int LIMIT = 3;
final Pipe pipe = Pipe.open();
Thread sender = new Thread(() -> {
WritableByteChannel src = pipe.sink();
ByteBuffer buffer = ByteBuffer.allocate(BUFSIZE);
for (int i = 0; i < LIMIT; i++) {
buffer.clear();
for (int j = 0; j < BUFSIZE; j++) {
double random = Math.random();
buffer.put((byte) (random * 256));
System.out.println("CLIENT Send: " + (byte) (random * 256));
}
buffer.flip();
try {
while (src.write(buffer) > 0) ;
} catch (IOException ioe) {
System.err.println(ioe.getMessage());
}
}
try {
src.close();
} catch (IOException ioe) {
ioe.printStackTrace();
}
});
Thread receiver = new Thread(() -> {
ReadableByteChannel dst = pipe.source();
ByteBuffer buffer = ByteBuffer.allocate(BUFSIZE);
try {
while (dst.read(buffer) >= 0) {
buffer.flip();
while (buffer.remaining() > 0) {
System.out.println("SERVER Receive: " + (buffer.get() & 255));
}
buffer.clear();
}
} catch (IOException ioe) {
System.err.println(ioe.getMessage());
}
});
sender.start();
receiver.start();

Selectors

Nonblocking mode doesn’t let an application determine if it can perform an operation without actually performing the operation.

The operating system is instructed to observe a group of streams and return some indication of which streams are ready to perform a specific operation (such as read). This capability lets a thread multiplex a potentially huge number of active streams by using the readiness information provided by the operating system.

Selectors let you achieve readiness selection in a Java context.

What Is a Selector

A selector is an object created from a subclass of the abstract java.nio.channels.Selector class. The selector maintains a set of channels that it examines to determine which channels are ready for reading, writing, completing a connection sequence, accepting another connection, or some combination of these tasks. The actual work is delegated to the operating system via the POSIX select() or a similar system call.

How selectors work

Selectors are used with selectable channels, which are objects whose classes ultimately inherit from the abstract java.nio.channels.SelectableChannel class, which describes a channel that can be multiplexed by a selector.

One or more previously created selectable channels are registered with a selector. Each registration returns an instance of a subclass of the abstract SelectionKey class, which is a token signifying the relationship between one channel and the selector. This key keeps track of two sets of operations: the interest set and the ready set. The interest set identifies the operations that will be tested for readiness the next time one of the selector's selection methods is invoked. The ready set identifies the operations for which the key's channel has been found to be ready.

How to use selectors

Selector‘s methods

  • abstract void close()
  • abstract boolean isOpen()
  • abstract Set<SelectionKey> keys(). Returns this selector’s key set.
  • static Selector open(). Open a selector.
  • abstract SelectorProvider provider()
  • abstract int select(). Returns the number of channels that have become ready for I/O operations since the last time it was called.
  • abstract int select(long timeout). Set timeout for selector’s selection operation.
  • abstract Set<SelectionKey> selectedKeys(). Returns this selector’s selected-key set.
  • abstract int selectNow(). It’s a nonblocking version of select().
  • abstract Selector wakeup(). If another thread is currently blocked in an invocation of the select() or select(long) methods then that invocation will return immediately.

SelectableChannel‘s methods

  • SelectionKey register(Selector sel, int ops)
  • SelectionKey register(Selector sel, int ops, Object att). The third parameter att (a non-null object) is known as an attachment and is a convenient way of recognizing a given channel or attaching additional information to the channel. It’s stored in the SelectionKey instance returned from this method.

SelectionKey

  • int-based constants to ops
    • OP_ACCEPT. Operation-set bit for socket-accept operations.
    • OP_CONNECT. Operation-set bit for socket-connect operations.
    • OP_READ. Operation-set bit for read operations.
    • OP_WRITE. Operation-set bit for write operations.
  • abstract SelectableChannel channel()

An application typically performs the following operations:

  1. Performs a selection operation.
  2. Obtains the selected keys followed by an iterator over the selected keys.
  3. Iterates over these keys and performs channel operations (read, write).

A selection operation is performed by invoking one of Selector’s selection methods. It doesn’t return until at least one channel is selected, until this selector’s wakeup() method is invoked, or until the current thread is interrupted, whichever comes first.

A key represents a relationship between a selectable channel and a selector. This relationship can be terminated by invoking SelectionKey's cancel() method.

When you’re finished with a selector, call Selector’s close() method. If a thread is currently blocked in one of this selector’s selection methods, it’s interrupted as if by invoking the selector’s wakeup() method. Any uncancelled keys still associated with this selector are invalidated, their channels are deregistered, and any other resources associated with this selector are released.

The following code shows an example of selector usage:

// get channels
ServerSocketChannel channel = ServerSocketChannel.open();
channel.configureBlocking(false);
// register channels to the selector
Selector selector = Selector.open();
channel.register(selector, SelectionKey.OP_ACCEPT); // a ServerSocketChannel only supports OP_ACCEPT
// selection operation
while(true){
int numReadyChannels = selector.select();
if (numReadyChannels == 0){
continue;
}
Set<SelectionKey> selectedKeys = selector.selectedKeys();
Iterator<SelectionKey> keyIterator = selectedKeys.iterator();

while(keyIterator.hasNext()){
SelectionKey key = keyIterator.next();
if (key.isAcceptable()){
ServerSocketChannel server = (ServerSocketChannel) key.channel();
SocketChannel client = server.accept();
if (client == null){
continue;
}
client.configureBlocking(false);
client.register(selector, SelectionKey.OP_READ);
} else if (key.isReadable()){
SocketChannel client = (SocketChannel) key.channel();
// Perform work on the socket channel...
} else if (key.isWritable()){
SocketChannel client = (SocketChannel) key.channel();
// Perform work on the socket channel...
}
keyIterator.remove(); // remove the handled key from the selected set; the selector never does this itself
}
}

Asynchronous I/O

NIO provides multiplexed I/O to facilitate the creation of highly scalable servers. Client code registers a socket channel with a selector to be notified when the channel is ready to start I/O.

NIO.2 provides asynchronous I/O, which lets client code initiate an I/O operation and be notified when the operation completes. Like multiplexed I/O, asynchronous I/O is commonly used to facilitate the creation of highly scalable servers.

The main concepts of asynchronous I/O are: asynchronous I/O overview, asynchronous file channels, asynchronous socket channels, and asynchronous channel groups.

Asynchronous I/O Overview

The java.nio.channels.AsynchronousChannel interface describes an asynchronous channel, which is a channel that supports asynchronous I/O operation (reads, writes, and so on).

An asynchronous I/O operation is initiated by calling a method that either returns a future or takes a completion handler argument, like:

  • Future<V> operation(...). The Future methods may be called to check whether the I/O operation has completed.
  • void operation(... A attachment, CompletionHandler<V, ? super A> handler). The attachment lets a stateless CompletionHandler object consume the results of many I/O operations. The handler is invoked to consume the result of the I/O operation when it completes or fails.

CompletionHandler declares the following methods to consume the result of an operation when it completes successfully, and to learn why the operation failed and take appropriate action:

  • void completed(V result, A attachment)
  • void failed(Throwable t, A attachment)

After an asynchronous I/O operation is initiated, the method returns immediately. You then call Future methods, or provide code in the CompletionHandler implementation, to learn more about the I/O operation's status and/or process its results.

The java.nio.channels.AsynchronousByteChannel interface extends AsynchronousChannel. It offers the following four methods:

  • Future<Integer> read(ByteBuffer dst)
  • <A> void read(ByteBuffer dst, A attachment, CompletionHandler<Integer,? super A> handler)
  • Future<Integer> write(ByteBuffer src)
  • <A> void write(ByteBuffer src, A attachment, CompletionHandler<Integer,? super A> handler)

Asynchronous File Channels

The abstract java.nio.channels.AsynchronousFileChannel class describes an asynchronous channel for reading, writing, and manipulating a file.

This channel is created when a file is opened by invoking one of AsynchronousFileChannel’s open() methods.

The file contains a variable-length sequence of bytes that can be read and written, and whose current size can be queried.

Files are read and written by calling AsynchronousFileChannel’s read() and write() methods. One pair returns a Future and the other pair receives a CompletionHandler as an argument.

An asynchronous file channel doesn’t have a current position within the file. Instead, the file position is passed as an argument to each read() and write() method that initiates asynchronous operations.

AsynchronousFileChannel’s methods

  • asynchronous I/O operations
    • Future<FileLock> lock()
    • <A> void lock(A attachment, CompletionHandler<FileLock, ? super A> handler)
    • abstract Future<Integer> read(ByteBuffer dst, long position)
    • abstract <A> void read(ByteBuffer dst, long position, A attachment, CompletionHandler<Integer, ? super A> handler)
    • abstract Future<Integer> write(ByteBuffer src, long position)
    • abstract <A> void write(ByteBuffer src, long position, A attachment, CompletionHandler<Integer, ? super A> handler)
  • static AsynchronousFileChannel open(Path file, OpenOption... options)
  • static AsynchronousFileChannel open(Path file, Set options, ExecutorService executor, FileAttribute... attrs)
  • abstract void force(boolean metaData). Forces any updates to this channel’s file to be written to the storage device that contains it.
  • abstract AsynchronousFileChannel truncate(long size)
  • FileLock tryLock()
  • abstract FileLock tryLock(long position, long size, boolean shared)

The following code shows two examples of AsynchronousFileChannel: the first polls a Future for completion, and the second uses a CompletionHandler:

public static void main(String[] args) throws Exception {
Path path = Paths.get(args[0]);
AsynchronousFileChannel ch = AsynchronousFileChannel.open(path);
ByteBuffer buf = ByteBuffer.allocate(1024);
Future<Integer> result = ch.read(buf, 0);
while (!result.isDone()){
System.out.println("Sleeping...");
Thread.sleep(500);
}
System.out.println("Finished = " + result.isDone());
System.out.println("Bytes erad = " + result.get());
ch.close();
}

public static void main(String[] args) throws Exception {
Path path = Paths.get(args[0]);
AsynchronousFileChannel ch = AsynchronousFileChannel.open(path);
ByteBuffer buf = ByteBuffer.allocate(1024);
Thread mainThd = Thread.currentThread();
ch.read(buf, 0, null,
new CompletionHandler<Integer, Void>(){
@Override
public void completed(Integer result, Void v){
System.out.println("Bytes read = " + result);
mainThd.interrupt();
}
@Override
public void failed(Throwable t, Void v){
System.out.println("Failure: " + t.toString());
mainThd.interrupt();
}
});
System.out.println("Waiting for completion");
try {
mainThd.join();
} catch (InterruptedException ie){
System.out.println("Terminating")
}
ch.close();
}

Asynchronous Socket Channels

The abstract java.nio.channels.AsynchronousServerSocketChannel class describes an asynchronous channel for stream-oriented listening sockets. Its counterpart channel for connecting sockets is described by the abstract java.nio.channels.AsynchronousSocketChannel class. The Server and Client classes below rely on Attachment, ConnectionHandler, and ReadWriteHandler helper classes that are not shown here.

public class Server
{
    private final static int PORT = 9090;
    private final static String HOST = "localhost";

    public static void main(String[] args)
    {
        AsynchronousServerSocketChannel channelServer;
        try {
            channelServer = AsynchronousServerSocketChannel.open();
            channelServer.bind(new InetSocketAddress(HOST, PORT));
            System.out.printf("Server listening at %s%n",
                              channelServer.getLocalAddress());
        } catch (IOException ioe) {
            System.err.println("Unable to open or bind server socket channel");
            return;
        }
        Attachment att = new Attachment();
        att.channelServer = channelServer;
        channelServer.accept(att, new ConnectionHandler());
        try {
            Thread.currentThread().join();
        } catch (InterruptedException ie) {
            System.out.println("Server terminating");
        }
    }
}
public class Client
{
    private final static Charset CSUTF8 = Charset.forName("UTF-8");
    private final static int PORT = 9090;
    private final static String HOST = "localhost";

    public static void main(String[] args)
    {
        AsynchronousSocketChannel channel;
        try {
            channel = AsynchronousSocketChannel.open();
        } catch (IOException ioe) {
            System.err.println("Unable to open client socket channel");
            return;
        }

        try {
            channel.connect(new InetSocketAddress(HOST, PORT)).get();
            System.out.printf("Client at %s connected%n",
                              channel.getLocalAddress());
        } catch (ExecutionException | InterruptedException eie) {
            System.err.println("Server not responding");
            return;
        } catch (IOException ioe) {
            System.err.println("Unable to obtain client socket channel's local address");
            return;
        }
        Attachment att = new Attachment();
        att.channel = channel;
        att.isReadMode = false;
        att.buffer = ByteBuffer.allocate(2048);
        att.mainThd = Thread.currentThread();
        byte[] data = "Hello".getBytes(CSUTF8);
        att.buffer.put(data);
        att.buffer.flip();
        channel.write(att.buffer, att, new ReadWriteHandler());
        try {
            att.mainThd.join();
        } catch (InterruptedException ie) {
            System.out.println("Client terminating");
        }
    }
}
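
The Attachment, ConnectionHandler, and ReadWriteHandler classes used by Server and Client are helper classes from the referenced book example and are not reproduced here. Judging only from the fields the two listings access, a minimal sketch of the Attachment class might look like this (the field set is an assumption, not the book's actual definition):

import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousServerSocketChannel;
import java.nio.channels.AsynchronousSocketChannel;

// Assumed minimal shape of the Attachment helper; only the fields that the
// Server and Client listings above actually use are included.
public class Attachment
{
    public AsynchronousServerSocketChannel channelServer; // set by Server before accept()
    public AsynchronousSocketChannel channel;             // set by Client before write()
    public boolean isReadMode;                            // lets ReadWriteHandler switch between read and write
    public ByteBuffer buffer;                              // I/O buffer shared with the completion handler
    public Thread mainThd;                                 // main thread to join on / interrupt when done
}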

Asynchronous Channel Groups

The abstract java.nio.channels.AsynchronousChannelGroup class describes a grouping of asynchronous channels for the purpose of resource sharing. A group has an associated thread pool to which tasks are submitted, to handle I/O events and to dispatch to completion handlers that consume the results of asynchronous operations performed on the group’s channels.

AsynchronousServerSocketChannel and AsynchronousSocketChannel belong to groups. When you create an asynchronous socket channel via the no-argument open() method, the channel is bound to the default group.

AsynchronousFileChannels don't belong to groups. However, they are associated with a thread pool to which tasks are submitted, to handle I/O events and to dispatch to completion handlers that consume the results of I/O operations on the channel.

You can configure the default group by initializing the following system properties at JVM startup:

  • java.nio.channels.DefaultThreadPool.threadFactory
  • java.nio.channels.DefaultThreadPool.initialSize

You can define your own channel group. It gives you more control over the threads that are used to service the I/O operations. Furthermore, it provides the mechanisms to shut down threads and to await termination. You can use AsynchronousChannelGroup’s methods to create your own channel group:

  • static AsynchronousChannelGroup withCachedThreadPool(ExecutorService executor, int initialSize)
  • static AsynchronousChannelGroup withFixedThreadPool(int nThreads, ThreadFactory threadFactory)
  • static AsynchronousChannelGroup withThreadPool(ExecutorService executor)

Other methods of AsynchronousChannelGroup:

  • abstract boolean awaitTermination(long timeout, TimeUnit unit)
  • abstract boolean isShutdown()
  • abstract boolean isTerminated().
  • abstract void shutdown(). Initiates an orderly shutdown of the group.
  • abstract void shutdownNow(). Shuts down the group and closes all open channels in the group.
  • AsynchronousChannelProvider provider()

After creating a group, you can bind an asynchronous socket channel to the group by calling the following class methods:

  • AsynchronousServerSocketChannel’s open(AsynchronousChannelGroup group)
  • AsynchronousSocketChannel’s open(AsynchronousChannelGroup group)

AsynchronousChannelGroup group = AsynchronousChannelGroup.
    withFixedThreadPool(20, Executors.defaultThreadFactory());
AsynchronousServerSocketChannel chServer =
    AsynchronousServerSocketChannel.open(group);
AsynchronousSocketChannel chClient =
    AsynchronousSocketChannel.open(group);

// After the operation has begun, the channel group is used to control the shutdown.
if (!group.isShutdown())
{
    // After the group is shut down, no more channels can be bound to it.
    group.shutdown();
}
if (!group.isTerminated())
{
    // Forcibly shut down the group. The channel is closed and the
    // accept operation aborts.
    group.shutdownNow();
}
// The group should be able to terminate; wait for 10 seconds maximum.
group.awaitTermination(10, TimeUnit.SECONDS);

Summary

NIO is nonblocking and selectable; it provides nonblocking I/O and multiplexed I/O to facilitate the creation of highly scalable servers.

NIO channels' read() and write() operations are nonblocking.

Channel read and write operations work with buffers: you write a buffer's data to a channel (to a device), and you read data from a channel (from a device) into a buffer.

NIO channels are selectable, so they can be managed with a selector. A selector maintains a set of channels that it examines to determine which channels are ready for reading, writing, completing a connection sequence, accepting another connection, or some combination of these tasks. The actual work is delegated to the operating system via POSIX select() or a similar system call.
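
As a reminder of what this selector-based style looks like in code, the following is a minimal sketch (the port number 9090 and the accept-only handling logic are placeholders, not part of the original text):

import java.net.InetSocketAddress;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.util.Iterator;

public class SelectorSketch {
    public static void main(String[] args) throws Exception {
        Selector selector = Selector.open();
        ServerSocketChannel server = ServerSocketChannel.open();
        server.bind(new InetSocketAddress(9090)); // arbitrary example port
        server.configureBlocking(false);          // selectable channels must be nonblocking
        server.register(selector, SelectionKey.OP_ACCEPT);

        while (true) {
            selector.select();                    // blocks until at least one channel is ready
            Iterator<SelectionKey> it = selector.selectedKeys().iterator();
            while (it.hasNext()) {
                SelectionKey key = it.next();
                it.remove();
                if (key.isAcceptable()) {
                    // accept the connection and register it with the same selector for reading
                    server.accept().configureBlocking(false)
                          .register(selector, SelectionKey.OP_READ);
                }
            }
        }
    }
}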

NIO provides nonblocking I/O and multiplexed I/O, but it still needs to check the channels' status in a loop. NIO.2's asynchronous I/O instead gets notified when an operation completes. Like multiplexed I/O, asynchronous I/O is also commonly used to facilitate the creation of highly scalable servers.

Blocking I/O, nonblocking I/O, and asynchronous I/O are all supported by the operating system.

References

[1] Java I/O, NIO and NIO.2 by Jeff Friesen

[2] Java SE 8 API

Understanding the Technique

  • Answer these questions: What is it? How does it work? What is its mission?
  • Understand its structure and the functionality of its modules.
  • Use it. Get experience with the technology.

Preparing to Start Reading Source Code

  • Collect questions.
  • Choose a version of the source code.
  • Start from a specific question.

The Process of Reading Source Code

  • Get some understanding of the technology, and ideally use it first. Be able to identify its mission, its functional modules, and its core features.
  • Start from a question. Before reading the source code, prepare the question you want to answer.
  • Repeatedly debug the program (pay attention to debugging techniques such as shortcut-key navigation and reading the DEBUG or TRACE log output) and take notes (mainly recording the key source files, core methods, core code, the values of key variables, and the inheritance hierarchies of some classes). Eventually, you will understand the overall flow of how the problem is implemented.
  • Organize the information and draw a preliminary sequence diagram.
  • Understand the complete flow. Keep debugging the program and draw a detailed sequence diagram.
  • Understand the concrete implementation of every step. Go through the code line by line to understand in detail how each step is implemented.
  • Summarize the implementation process in your own words.

Methods of Reading Source Code

  • The onion reading method: go deeper into the details gradually.
  • Summarize a piece of code in your own words.
  • Use the IDE's shortcut keys to navigate the code. Open important source files in separate windows. Add key lines of code to bookmarks.
  • Use an earlier, simpler version of the source code.
  • Use UML diagrams to represent the code's logic and structure; they help you understand and remember it.
  • Modify and test the source code.
  • Print the framework's DEBUG-level logs.

References

[1] How to read code without ripping your hair out

[2] What is the best way to read source code?

[3] BECOME A BETTER DEVELOPER BY READING SOURCE CODE

In this post, we are going to dive deep into the Spring framework. The main content of this post is an analysis of the IoC implementation in the source code of the Spring framework.

Basics

Basic Concepts

Bean and BeanDefinition

In Spring, the objects that form the backbone of your application and that are managed by the Spring IoC container are called beans. A bean is an object that is instantiated, assembled, and otherwise managed by a Spring IoC container.

A Spring IoC container manages one or more beans. These beans are created with the configuration metadata that you supply to the container, for example, in the form of XML <bean/> definitions.

Within the container itself, these bean definitions are represented as BeanDefinition objects.
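
As a small illustration of this, the following sketch registers a BeanDefinition programmatically with a DefaultListableBeanFactory and then obtains the bean from it (the UserService class is hypothetical, and this is only one of several ways of supplying configuration metadata):

import org.springframework.beans.factory.support.DefaultListableBeanFactory;
import org.springframework.beans.factory.support.GenericBeanDefinition;

public class BeanDefinitionSketch {

    // hypothetical application class managed as a bean
    public static class UserService { }

    public static void main(String[] args) {
        DefaultListableBeanFactory factory = new DefaultListableBeanFactory();

        GenericBeanDefinition definition = new GenericBeanDefinition();
        definition.setBeanClass(UserService.class);                 // metadata: which class to instantiate

        factory.registerBeanDefinition("userService", definition);  // stored internally as a BeanDefinition
        UserService service = factory.getBean(UserService.class);   // the container instantiates and returns the bean
        System.out.println(service);
    }
}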

BeanWrapper

BeanWrapper is the central interface of Spring's low-level JavaBeans infrastructure. Typically it is not used directly but rather implicitly via a BeanFactory or a DataBinder. It provides operations to analyze and manipulate standard JavaBeans: get and set property values, get property descriptors, and query the readability/writability of properties.
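
A minimal sketch of using BeanWrapper directly is shown below (normally you would not do this yourself; the User class and its name property are hypothetical):

import org.springframework.beans.BeanWrapper;
import org.springframework.beans.BeanWrapperImpl;

public class BeanWrapperSketch {

    // hypothetical JavaBean with one property
    public static class User {
        private String name;
        public String getName() { return name; }
        public void setName(String name) { this.name = name; }
    }

    public static void main(String[] args) {
        BeanWrapper wrapper = new BeanWrapperImpl(new User());
        wrapper.setPropertyValue("name", "Tao");                  // set a property by name
        System.out.println(wrapper.getPropertyValue("name"));     // read it back: prints "Tao"
        System.out.println(wrapper.isWritableProperty("name"));   // query writability: prints "true"
    }
}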

IoC Containers

The IoC container is the implementation of the IoC functionality in the Spring framework. The interface org.springframework.context.ApplicationContext represents the Spring IoC container and is responsible for instantiating, configuring, and assembling the aforementioned beans. The container gets its instructions on what objects to instantiate, configure, and assemble by reading configuration metadata. The configuration metadata is represented in XML, Java annotations, or Java code. It allows you to express the objects that compose your application and the rich interdependencies between such objects.

Several implementations of the ApplicationContext interface are supplied out-of-the-box with Spring. In standalone applications it is common to create an instance of ClassPathXmlApplicationContext or FileSystemXmlApplicationContext.
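
For example, bootstrapping the container in a standalone application might look like the following sketch (the bean name userService is hypothetical; applicationContext.xml is assumed to be on the classpath, as in the web.xml configuration shown later):

import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class ContainerBootstrapSketch {
    public static void main(String[] args) {
        // read the XML configuration metadata from the classpath and build the container
        ApplicationContext context =
                new ClassPathXmlApplicationContext("applicationContext.xml");

        // retrieve a fully configured bean by name (hypothetical bean id)
        Object userService = context.getBean("userService");
        System.out.println(userService);
    }
}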

BeanFactory vs ApplicationContext

Spring provides two kinds of IoC containers: the BeanFactory (for example, XmlBeanFactory) and the ApplicationContext. The ApplicationContext is a more powerful IoC container than the BeanFactory.

Bean Factory

  • Bean instantiation/wiring

Application Context

  • Bean instantiation/wiring
  • Automatic BeanPostProcessor registration
  • Automatic BeanFactoryPostProcessor registration
  • Convenient MessageSource access (for i18n)
  • ApplicationEvent publication

WebApplicationContext vs ApplicationContext

The WebApplicationContext interface provides configuration for a web application. It is read-only while the application is running, but may be reloaded if the implementation supports this.

WebApplicationContext extends ApplicationContext. This interface adds a getServletContext() method to the generic ApplicationContext interface, and defines a well-known application attribute name that the root context must be bound to in the bootstrap process.

WebApplicationContext in Spring is a web-aware ApplicationContext, i.e., it has ServletContext information. In a single web application there can be multiple WebApplicationContexts; each DispatcherServlet is associated with a single WebApplicationContext.
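
For illustration, code running inside a servlet can look up the root WebApplicationContext that was bound to the ServletContext during bootstrap; a minimal sketch (the servlet class itself is hypothetical) is:

import javax.servlet.http.HttpServlet;
import org.springframework.web.context.WebApplicationContext;
import org.springframework.web.context.support.WebApplicationContextUtils;

// hypothetical servlet that only demonstrates the lookup
public class RootContextLookupServlet extends HttpServlet {

    @Override
    public void init() {
        WebApplicationContext rootContext =
                WebApplicationContextUtils.getWebApplicationContext(getServletContext());
        // rootContext exposes the beans defined in the root application context
        System.out.println(rootContext);
    }
}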

Class Hierarchy

Bean Factories Hierarchy

I-BeanFactory
|----I-ListableBeanFactory
|--------C-StaticListableBeanFactory
|----I-HierarchicalBeanFactory
|--------I-ConfigurableBeanFactory
|------------A-AbstractBeanFactory
|------------I-ConfigurableListableBeanFactory
|----I-AutowireCapableBeanFactory
|---------A-AbstractAutowireCapableBeanFactory
|--------------C-DefaultListableBeanFactory
|-------------------C-XmlBeanFactory

ApplicationContext Hierarchy

I-ListableBeanFactory
I-HierarchicalBeanFactory
|----I-ApplicationContext
|--------I-WebApplicationContext
|--------I-ConfigurableApplicationContext
|------------A-AbstractApplicationContext
|------------I-ConfigurableWebApplicationContext
|----------------A-AbstractRefreshableWebApplicationContext
|--------------------C-XmlWebApplicationContext

AbstractApplicationContext Hierarchy

A-AbstractApplicationContext
|----A-AbstractRefreshableApplicationContext
|--------A-AbstractRefreshableConfigApplicationContext
|------------A-AbstractRefreshableWebApplicationContext
|----------------C-AnnotationConfigWebApplicationContext
|----------------C-XmlWebApplicationContext

BeanRegister & BeanFactory Hierarchy

I-SingletonBeanRegistry
|----C-DefaultSingletonBeanRegistry
|--------A-FactoryBeanRegistrySupport
|------------A-AbstractBeanFactory

DispatcherServlet Hierarchy

A-javax.servlet.http.HttpServlet
|----A-HttpServletBean
|--------A-FrameworkServlet
|------------C-DispatcherServlet

The Process of IoC Implementation

The following figure shows the process of IoC implementation.

Next, we are going to analyze the implementation of every step in the source code.

1 Java web project starting

1.1 instantiate and init servlets from web.xml

When a Java web project runs in an application server like Apache Tomcat, the servlet container will load, instantiate, and initialize the Java servlets that are marked with <load-on-startup> in the web.xml file.

<servlet>
    <servlet-name>DispatcherServlet</servlet-name>
    <servlet-class>org.springframework.web.servlet.DispatcherServlet</servlet-class>
    <init-param>
        <param-name>contextConfigLocation</param-name>
        <param-value>classpath:applicationContext.xml</param-value>
    </init-param>
    <load-on-startup>1</load-on-startup>
</servlet>
<servlet-mapping>
    <servlet-name>DispatcherServlet</servlet-name>
    <url-pattern>/</url-pattern>
</servlet-mapping>

2 The DispatcherServlet initialization and Calling init() in HttpServletBean

2.1 set bean properties from init parameters

Set the init parameters specified in the web.xml file on the DispatcherServlet servlet instance.

3 initServletBean() in FrameworkServlet

3.1 initWebApplicationContext()

Initialize and publish the WebApplicationContext for this servlet.

  • Check whether there is already a WebApplicationContext instance or a root application context; if not, create a new WebApplicationContext instance (see #3.1.1).

3.1.1 createWebApplicationContext()

Using reflection to create a WebApplicationContext instance, and configure it (see #3.1.1.1).

FrameworkServlet.java

protected WebApplicationContext createWebApplicationContext(@Nullable ApplicationContext parent) {
    //...
    ConfigurableWebApplicationContext wac =
            (ConfigurableWebApplicationContext) BeanUtils.instantiateClass(contextClass);
    //...go to #3.1.1.1
    configureAndRefreshWebApplicationContext(wac);
    return wac;
}

3.1.1.1 configureAndRefreshWebApplicationContext()

Set WebApplication instance properties and go to refresh (See #4).

FrameworkServlet.java

protected void configureAndRefreshWebApplicationContext(ConfigurableWebApplicationContext wac) {
    //...
    wac.setServletContext(getServletContext());
    wac.setServletConfig(getServletConfig());
    wac.setNamespace(getNamespace());
    postProcessWebApplicationContext(wac);
    applyInitializers(wac);
    //...go to #4
    wac.refresh();
}

4 refresh() in AbstractApplicationContext

As this is a startup method, it should destroy already created singletons if it fails, to avoid dangling resources. In other words, after invocation of that method, either all or no singletons at all should be instantiated.

AbstractApplicationContext.java

public void refresh() throws BeansException, IllegalStateException {
    //...
    // 1. Create a BeanFactory instance. Refer to #4.1
    ConfigurableListableBeanFactory beanFactory = this.obtainFreshBeanFactory();
    // 2. Configure properties of AbstractApplicationContext, prepare for bean initialization and context refresh
    this.postProcessBeanFactory(beanFactory);
    this.invokeBeanFactoryPostProcessors(beanFactory);
    this.registerBeanPostProcessors(beanFactory);
    this.initMessageSource();
    this.initApplicationEventMulticaster();
    this.onRefresh();
    this.registerListeners();
    // 3. bean initialization. Refer to #4.2
    this.finishBeanFactoryInitialization(beanFactory);
    // 4. context refresh.
    this.finishRefresh();
    // ...
}

4.1 obtainFreshBeanFactory()

Create a BeanFactory instance in AbstractRefreshableApplicationContext.

AbstractRefreshableApplicationContext.java

protected final void refreshBeanFactory() throws BeansException {
    //...
    // 1. Creating a BeanFactory instance.
    DefaultListableBeanFactory beanFactory = this.createBeanFactory();
    // 2. Set BeanFactory instance's properties
    beanFactory.setSerializationId(this.getId());
    this.customizeBeanFactory(beanFactory);
    // 3. load bean definitions
    this.loadBeanDefinitions(beanFactory);
    // 4. assign this new BeanFactory instance to the AbstractRefreshableApplicationContext's field beanFactory
    synchronized (this.beanFactoryMonitor) {
        this.beanFactory = beanFactory;
    }
    //...
}

protected DefaultListableBeanFactory createBeanFactory() {
    return new DefaultListableBeanFactory(this.getInternalParentBeanFactory());
}

4.2 finishBeanFactoryInitialization()

Initialize the beanFactory, and go on to instantiate the bean instances (see #5).

AbstractApplicationContext.java

protected void finishBeanFactoryInitialization(ConfigurableListableBeanFactory beanFactory) {
    // 1. initialize bean factory
    //...
    // 2. instantiate bean instances, go to #5
    beanFactory.preInstantiateSingletons();
}

5 preInstantiateSingletons() in DefaultListableBeanFactory

Loop over the bean names to create the bean instances.

5.1 Create all bean instances

DefaultListableBeanFactory.java

public void preInstantiateSingletons() throws BeansException {
    // 1. copy the list of bean names
    List<String> beanNames = new ArrayList<>(this.beanDefinitionNames);

    // 2. loop over the bean names to create the bean instances, go to #6
    for (String beanName : beanNames) {
        //...
        if (this.isFactoryBean(beanName)) {
            Object bean = this.getBean("&" + beanName);
            //...
        } else {
            this.getBean(beanName);
        }
        //...
    }
}

6 getBean(name) in AbstractBeanFactory

To get the bean instances.

AbstractBeanFactory.java

public Object getBean(String name) throws BeansException {
    return this.doGetBean(name, (Class) null, (Object[]) null, false);
}

6.1 doGetBean()

To get the bean instance by bean name.

AbstractBeanFactory.java

protected <T> T doGetBean(String name, @Nullable Class<T> requiredType, @Nullable Object[] args, boolean typeCheckOnly) throws BeansException {
    String beanName = this.transformedBeanName(name);
    Object sharedInstance = this.getSingleton(beanName);
    Object bean;

    // 1. check whether the bean instance already exists or is in creation
    if (sharedInstance != null && args == null) {
        //...
    } else {
        // 2. get and check the beanDefinition by beanName
        RootBeanDefinition mbd = this.getMergedLocalBeanDefinition(beanName);
        this.checkMergedBeanDefinition(mbd, beanName, args);
        // 3. get the bean instance according to its scope: singleton, prototype, and so on.
        if (mbd.isSingleton()) {
            // to create a singleton bean instance go to #7
            sharedInstance = this.getSingleton(beanName, () -> {
                try {
                    return this.createBean(beanName, mbd, args);
                } catch (BeansException var5) {
                    this.destroySingleton(beanName);
                    throw var5;
                }
            });
            bean = this.getObjectForBeanInstance(sharedInstance, name, beanName, mbd);
        } else if (mbd.isPrototype()) {
            //...
        } else {
            //...
        }
    }
    //...
}

7 getSingleton(beanName, singletonFactory) in DefaultSingletonBeanRegistry

To get singleton bean instance.

DefaultSingletonBeanRegistry.java

public Object getSingleton(String beanName, ObjectFactory<?> singletonFactory) {
    // 1. check conditions and logging
    //...
    // 2. to get a singleton bean instance, go to #8
    singletonObject = singletonFactory.getObject();
    newSingleton = true;
    // 3. add created singleton bean instance to the bean list
    if (newSingleton) {
        this.addSingleton(beanName, singletonObject);
    }
}

protected void addSingleton(String beanName, Object singletonObject) {
    synchronized (this.singletonObjects) {
        // add bean instance to the bean list "singletonObjects"
        this.singletonObjects.put(beanName, singletonObject);
        this.singletonFactories.remove(beanName);
        this.earlySingletonObjects.remove(beanName);
        this.registeredSingletons.add(beanName);
    }
}

8 createBean(beanName, beanDefinition, args) in AbstractAutowireCapableBeanFactory

Create a bean instance and its BeanWrapper, initialize the bean, and add the bean to the beanFactory.

AbstractAutowireCapableBeanFactory.java

protected Object createBean(String beanName, RootBeanDefinition mbd, @Nullable Object[] args) throws BeanCreationException {
    // ...
    // 1. construct a beanDefinition for use.
    // ...
    // 2. give bean post-processors a chance to return a shortcut instance (e.g. a proxy)
    Object beanInstance = this.resolveBeforeInstantiation(beanName, mbdToUse);
    if (beanInstance != null) {
        return beanInstance;
    }
    // 3. otherwise, actually create the bean instance
    beanInstance = this.doCreateBean(beanName, mbdToUse, args);
    return beanInstance;
}

protected Object doCreateBean(String beanName, RootBeanDefinition mbd, @Nullable Object[] args) throws BeanCreationException {
    // 1. construct a beanWrapper and create the bean instance
    BeanWrapper instanceWrapper = this.createBeanInstance(beanName, mbd, args);
    // 2. post processing
    //...
    // 3. add the bean to the beanFactory and to some bean instance record lists
    this.addSingletonFactory(beanName, () -> {
        return this.getEarlyBeanReference(beanName, mbd, bean);
    });
    // 4. autowire and handle property values
    this.populateBean(beanName, mbd, instanceWrapper);
    // 5. initialize the bean
    exposedObject = this.initializeBean(beanName, exposedObject, mbd);
    // 6. return the bean
    return exposedObject;
}

protected BeanWrapper createBeanInstance(String beanName, RootBeanDefinition mbd, @Nullable Object[] args) {
    // ...
    ctors = mbd.getPreferredConstructors();
    return ctors != null ? this.autowireConstructor(beanName, mbd, ctors, (Object[]) null) : this.instantiateBean(beanName, mbd);
}

protected BeanWrapper instantiateBean(String beanName, RootBeanDefinition mbd) {
    // 1. create the bean instance
    Object beanInstance = this.getInstantiationStrategy().instantiate(mbd, beanName, this);

    // 2. create the BeanWrapper from the bean instance
    BeanWrapper bw = new BeanWrapperImpl(beanInstance);
    this.initBeanWrapper(bw);
    return bw;
}

9 instantiate() in SimpleInstantiationStrategy

Get the constructor of the bean class and create the bean instance.

SimpleInstantiationStrategy.java

public Object instantiate(RootBeanDefinition bd, @Nullable String beanName, BeanFactory owner) {
    // 1. get the Constructor of the bean class
    Constructor constructorToUse = (Constructor) bd.resolvedConstructorOrFactoryMethod;
    Class<?> clazz = bd.getBeanClass();
    constructorToUse = (Constructor) AccessController.doPrivileged(() -> {
        return clazz.getDeclaredConstructor();
    });
    // 2. create the bean instance
    return BeanUtils.instantiateClass(constructorToUse, new Object[0]);
}

10 instantiateClass() in BeanUtils

Use reflection to create an instance.

BeanUtils.java

public static <T> T instantiateClass(Constructor<T> ctor, Object... args) throws BeanInstantiationException {
    //...
    // use reflection to create the instance
    return ctor.newInstance(argsWithDefaultValues);
}

Conclusion

The Spring framework relies on the servlet container to load the DispatcherServlet at server startup, which triggers the automatic instantiation of beans.

The ApplicationContext is responsible for the whole process of bean instantiation. There is a BeanFactory as a field in the ApplicationContext; it is responsible for creating the bean instances.

The important steps of bean instantiation are: load the bean definitions, create and initialize the beanFactory, create and initialize the bean instances, and keep records of the beans in the beanFactory.

References

[1] Spring Framework Source Code

[2] Spring Framework API - Current 5.2.7.RELEASE

[3] Spring Framework Documentation - 5.2.7.RELEASE

[4] How does spring Dispatcher Servlet create default beans without any XML configuration? - Stack Overflow

[5] Difference between ApplicationContext and WebApplicationContext in Spring MVC

[6] Spring life-cycle when we refresh the application context - Stack Overflow

[7] BeanFactory vs ApplicationContext - Stack Overflow

Many applications perform well with the default settings of a JVM, but some applications require additional JVM tuning to meet their performance requirements. A well-tuned JVM configuration used for one application may not be well suited for another application. As a result, understanding how to tune a JVM is a necessity.

Process of JVM Tuning

The following figure shows the process of JVM tuning.

The first step of JVM tuning is to prioritize the application's systemic requirements. In contrast to functional requirements, which indicate functionally what an application computes or produces for output, systemic requirements indicate particular aspects of an application's quality of service, such as its throughput, response time, the amount of memory it consumes, startup time, availability, manageability, and so on.

Performance tuning a JVM involves many trade-offs. When you emphasize one systemic requirement, you usually sacrifice something in another. For example, minimizing memory footprint usually comes at the expense of throughput and/or latency. As you improve manageability, you sacrifice some level of availability of the application since running fewer JVMs puts a larger portion of an application at risk should there be an unexpected failure. Therefore, when emphasizing systemic requirements, it is crucial to the tuning process to understand which are most important to the application.

Once you know which systemic requirements are most important, the following steps of tuning process are:

  • Choose a JVM deployment mode.
  • Choose a JVM runtime environment.
  • Tune the garbage collector to meet your application's memory footprint, pause time/latency, and throughput requirements.

For some applications and their systemic requirements, it may take several iterations of this process until the application’s stakeholders are satisfied with the application’s performance.

Application Execution Assumptions

It is assumed that a typical application's execution can be divided into three phases:

  • An initialization phase where the application initializes important data structures and other necessary artifacts to begin its intended use.
  • A steady state phase where the application spends most of its time and where the application’s core functionality is executed.
  • An optional summary phase where an artifact such as a report may be generated, such as that produced by executing a benchmark program just prior to the application ending its execution.

The steady state phase where an application spends most of its time is the phase of most interest.

Testing Infrastructure Requirements

The performance testing environment should be close to the production environment. The better the testing environment replicates the production environment running with a realistic production load, the more accurate and better informed the tuning decisions will be.

Application Systemic Requirements

There are several application systemic requirements, such as its throughput, response time, the amount of memory it consumes, its availability, its manageability, and so on.

Availability

Availability is a measure of the application being in an operational and usable state. An availability requirement expresses to what extent an application, or portions of an application, are available for use when some component breaks or experiences a failure.

In the context of a Java application, higher availability can be accomplished by running portions of an application across multiple JVMs or by multiple instances of the application in multiple JVMs. One of the trade-offs when emphasizing availability is increased manageability costs.

Manageability

Manageability is a measure of the operational costs associated with running and monitoring the application, along with how easy it is to configure the application. A manageability requirement expresses the ease with which the system can be managed. Configuration tends to be easier with fewer JVMs, but the application's availability may be sacrificed.

Throughput

Throughput is a measure of the amount of work that can be performed per unit time. A throughput requirement ignores latency or responsiveness. Usually, increased throughput comes at the expense of an increase in latency and/or an increase in memory footprint.

Latency and Responsiveness

Latency, or responsiveness, is a measure of the elapsed time between when an application receives a stimulus to do some work and that work is completed. A latency or responsiveness requirement ignores throughput. Usually, increased responsiveness or lower latency, comes at the expense of lower throughput and/or an increase in memory footprint.

Memory Footprint

Memory footprint is a measure of the amount of memory required to run an application at some level of throughput, some level of latency, and/or some level of availability and manageability. Memory footprint is usually expressed as either the amount of Java heap required to run the application and/or the total amount of memory required to run the application.

Startup Time

Startup time is a measure of the amount of time it takes for an application to initialize. The time it takes to initialize a Java application is dependent on many factors including but not limited to the number of classes loaded, the number of objects that require initialization, how those objects are initialized, and the choice of a HotSpot VM runtime, that is, client or server.

Rank Systemic Requirements

The first step in the tuning process is prioritizing the application’s systemic requirements. Doing so involves getting the major application stakeholders together and agreeing upon the prioritization.

Ranking the systemic requirements in order of importance to the application stakeholders is critical to the tuning process. The most important systemic requirements drive some of the initial decisions.

Choose JVM Deployment Model

There is not necessarily a best JVM deployment model. The most appropriate choice depends on which systemic requirements (manageability, availability, etc.) are most important.

Generally, with JVM deployment models, the fewer the JVMs the better. With fewer JVMs, there are fewer JVMs to monitor and manage, along with a smaller total memory footprint.

Choose JVM Runtime

Choosing a JVM runtime for a Java application is about choosing between runtime environments, one of which tends to be better suited for client applications and the other for server applications.

There are several runtime environments to consider: client or server runtime, 32-bit or 64-bit JVM, and garbage collectors.

Client or Server Runtime

There are three types of JVM runtime to choose from when using the HotSpot VM: client, server, or tiered.

  • The client runtime is specialized for rapid startup, small memory footprint, and a JIT compiler with rapid code generation.
  • The server runtime offers more sophisticated code generation optimizations, which are more desirable in server applications. Many of the optimizations found in the server runtime’s JIT compiler require additional time to gather more information about the behavior of the program and to produce better performing generated code.
  • The tiered runtime combines the best of the client and server runtimes, that is, rapid startup time and highly performing generated code.

If you do not know which runtime to initially choose, start with the server runtime. If startup time or memory footprint requirements cannot be met and you are using Java 6 Update 25 or later, try the tiered server runtime. If you are not running Java 6 Update 25 or later, or the tiered server runtime is unable to meet your startup time or memory footprint requirement, switch to the client runtime.

32-Bit or 64-Bit JVM

There is a choice between 32-bit and 64-bit JVMs.

The following table provides some guidelines for making an initial decision on whether to start with a 32-bit or 64-bit JVM. Note that client runtimes are not available in 64-bit HotSpot VMs.

| Operating System | Java Heap Size | 32-Bit or 64-Bit JVM |
| --- | --- | --- |
| Windows | Less than 1300 megabytes | 32-bit |
| Windows | Between 1500 megabytes and 32 gigabytes | 64-bit with -d64 -XX:+UseCompressedOops command line options |
| Windows | More than 32 gigabytes | 64-bit with -d64 command line option |
| Linux | Less than 2 gigabytes | 32-bit |
| Linux | Between 2 and 32 gigabytes | 64-bit with -d64 -XX:+UseCompressedOops command line options |
| Linux | More than 32 gigabytes | 64-bit with -d64 command line option |
| Oracle Solaris | Less than 3 gigabytes | 32-bit |
| Oracle Solaris | Between 3 and 32 gigabytes | 64-bit with -d64 -XX:+UseCompressedOops command line options |
| Oracle Solaris | More than 32 gigabytes | 64-bit with -d64 command line option |
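
For example, launching a 64-bit HotSpot VM with compressed oops for a heap under 32 gigabytes might look like the following (the heap size and the application jar name are placeholders):

java -d64 -XX:+UseCompressedOops -Xms8g -Xmx8g -jar myapp.jar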

Garbage Collectors

There are several garbage collectors available in the HotSpot VM: serial, throughput, mostly concurrent, and garbage first.

Since it is possible for applications to meet their pause time requirements with the throughput garbage collector, start with the throughput garbage collector and migrate to the concurrent garbage collector if necessary. If migration to the concurrent garbage collector is required, it will happen later in the tuning process, as part of the "determine and tune application latency" step.

The throughput garbage collector is specified by the HotSpot VM command line option -XX:+UseParallelOldGC or -XX:+UseParallelGC. The difference between the two is that the -XX:+UseParallelOldGC is both a multithreaded young generation garbage collector and a multithreaded old generation garbage collector. -XX:+UseParallelGC enables only a multithreaded young generation garbage collector.

GC Tuning Fundamentals

GC tuning fundamentals contain the following content: attributes of garbage collection performance, GC tuning principles, and GC logging. Understanding the important trade-offs among the attributes, the tuning principles, and what information to collect is crucial to JVM tuning.

The Performance Attributes

  • Throughput.
  • Latency.
  • Footprint.

A performance improvement for one of these attributes almost always is at the expense of one or both of the other attributes.

The Principles

There are three fundamental principles to tuning GC:

  • The minor GC reclamation principle. At each minor garbage collection, maximize the number of objects reclaimed. Adhering to this principle helps reduce the number and frequency of full garbage collections experienced by the application. Full GCs typically have the longest duration and, as a result, are the reason applications fail to meet their latency or throughput requirements.
  • The GC maximize memory principle. The more memory made available to the garbage collector, that is, the larger the Java heap space, the better the garbage collector and the application perform when it comes to throughput and latency.
  • The 2 of 3 GC tuning principle. Tune the JVM's garbage collector for two of the three performance attributes: throughput, latency, and footprint.

Keeping these three principles in mind makes the task of meeting your application’s performance requirements much easier.

Command Line Options and GC Logging

The JVM tuning decisions made utilize metrics observed from monitoring garbage collections. Collecting this information in garbage collection logs is the best approach. This garbage collection statistics gathering must be enabled via HotSpot VM command line options. Enabling GC logging, even in production systems, is a good idea. It has minimal overhead and provides a wealth of information that can be used to correlate application level events with garbage collection or JVM level events.

Several HotSpot VM command line options are of interest for garbage collection logging. The following is the minimal set of recommended command line options to use (a combined launch example follows the list):

-XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:<filename>
  • -XX:+PrintGCTimeStamps prints a time stamp representing the number of seconds since the HotSpot VM was launched until the garbage collection occurred.
  • -XX:+PrintGCDetails provides garbage collector statistics.
  • -Xloggc:<filename> directs the garbage collection information to the file.
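
For example, these options might be combined on the command line as follows (the log file name and the application jar name are placeholders):

java -XX:+PrintGCTimeStamps -XX:+PrintGCDetails -Xloggc:gc.log -jar myapp.jar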

Other useful options:

  • To see a calendar date and time of day rather than a time stamp
    • -XX:+PrintGCDateStamps
  • Tuning for low response time/latency
    • -XX:+PrintGCApplicationStoppedTime
    • -XX:+PrintGCApplicationConcurrentTime
    • -XX:+PrintSafepointStatistics

Here is example output using the preceding three garbage collection logging command line options.

45.152: [GC
[PSYoungGen: 295648K->32968K(306432K)]
296198K->33518K(1006848K), 0.1083183 secs]
[Times: user=1.83 sys=0.01, real=0.11 secs]
  • In the first line of GC logging, 45.152 is the number of seconds since the JVM launched.
  • In the second line, PSYoungGen means a young generation garbage collection. 295648K->32968K(306432K) gives the occupancy of the young generation space before and after the garbage collection. The value inside the parentheses (306432K) is the size of the young generation space, that is, the total size of eden and the two survivor spaces.
  • In the third line, 296198K->33518K(1006848K) gives the Java heap utilization before and after the garbage collection. The value inside the parentheses (1006848K) is the total size of the Java heap.
  • In the fourth line, [Times: user=1.83 sys=0.01, real=0.11 secs] provides CPU and elapsed time information: 1.83 seconds of CPU time in user mode, 0.01 seconds of CPU time in kernel (system) mode, and 0.11 seconds of elapsed wall clock time. In this example, it took 0.11 seconds to complete the garbage collection.

Determine Memory Footprint

Up to this point in the tuning process, no measurements have been taken; only some initial choices have been made. This step of the tuning process provides a good estimate of the amount of memory, or Java heap size, required to run the application. The outcome of this step identifies the live data size for the application. The live data size provides input into a good starting point for a Java heap size configuration to run the application.

Live data size is the amount of memory in the Java heap consumed by the set of long-lived objects required to run the application in its steady state. In other words, it is the Java heap occupancy after a full garbage collection while the application is in steady state.

Constraints

The following list helps determine how much physical memory can be made available to the JVM(s):

  • Will the Java application be deployed in a single JVM on a machine where it is the only application running? If that is the case, then all the physical memory on the machine can be made available to the JVM.
  • Will the Java application be deployed in multiple JVMs on the same machine? Or will the machine be shared by other processes or other Java applications? If either case applies, then you must decide how much physical memory will be made available to each process and JVM.

In either of the preceding scenarios, some memory must be reserved for the operating system.

HotSpot VM Heap Layout

Before taking some footprint measurements, it is important to have an understanding of the HotSpot VM Java heap layout. The HotSpot VM has three major spaces: young generation, old generation, and permanent generation. The heap data area is composed of the young generation and the old generation.

New objects are allocated in the young generation space. Objects that survive some number of minor garbage collections are promoted into the old generation space. The permanent generation space holds VM and Java class metadata as well as interned Strings and class static variables.

The following command line options specify the sizes of these data areas (a combined example follows the list):

  • Heap area
    • -Xmx specifies the maximum size of the heap.
    • -Xms specifies the initial size of the heap.
  • Young generation
    • -XX:NewSize=<n>[g|m|k] specifies the initial and minimum size of the young generation space.
    • -XX:MaxNewSize=<n>[g|m|k] specifies the maximum size of the young generation space.
    • -Xmn<n>[g|m|k] sets the initial, minimum, and maximum size of the young generation space.
  • Permanent generation
    • -XX:PermSize=<n>[g|m|k] specifies the initial and minimum size of the permanent generation space.
    • -XX:MaxPermSize=<n>[g|m|k] specifies the maximum size of the permanent generation space.
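
For example, a hypothetical configuration that fixes the Java heap at 1 gigabyte, the young generation at 256 megabytes, and the permanent generation at 128 megabytes would be (the application jar name is a placeholder):

java -Xms1g -Xmx1g -Xmn256m -XX:PermSize=128m -XX:MaxPermSize=128m -jar myapp.jar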

If any Java heap size, initial or maximum size, young generation size, or permanent generation size is not specified, the HotSpot VM automatically chooses values based on the system configuration it discovers through an adaptive tuning feature called ergonomics.

It is important to understand that a garbage collection occurs when any one of the three spaces, young generation, old generation, or permanent generation, is in a state where it can no longer satisfy an allocation event.

When the young generation space does not have enough space available to satisfy a Java object allocation, the HotSpot VM performs a minor GC to free up space. Minor GCs tend to be short in duration relative to full GCs.

When the old generation space no longer has available space for promoted objects, the HotSpot VM performs a full garbage collection. It actually performs a full garbage collection when it determines there is not enough available space for object promotions from the next minor garbage collection.

A full garbage collection also occurs when the permanent generation space does not have enough available space to store additional VM or class metadata.

In a full GC, the old generation, the permanent generation, and the young generation are all garbage collected. -XX:-ScavengeBeforeFullGC disables young generation space garbage collection on full garbage collections.

Heap Size Starting Point

To begin the heap size tuning process, a starting point is needed. The approach may start with a larger Java heap size than is necessary to run the Java application. The purpose of this step is to gather some initial data and further refine the heap size to more reasonable values.

For the JVM runtime, start with the throughput garbage collector. The throughput garbage collector is specified with the -XX:+UseParallelOldGC command line option.

If you have a good sense of the amount of Java heap space the application will require, you can use that Java heap size as a starting point. If you do not know what Java heap size the Java application will require, you can start with the Java heap size the HotSpot VM automatically chooses.

If you observe OutOfMemoryErrors in the garbage collection logs while attempting to put the application into its steady state, take notice of whether the old generation space or the permanent generation space is running out of memory. The following example illustrates an OutOfMemoryError occurring as a result of a too small old generation space:

2010-11-25T18:51:03.895-0600: [Full GC
[PSYoungGen: 279700K->267300K(358400K)]
[ParOldGen: 685165K->685165K(685170K)]
964865K->964865K(1043570K)
[PSPermGen: 32390K->32390K(65536K)], 0.2499342 secs]
[Times: user=0.08 sys=0.00, real=0.05 secs]
Exception in thread "main" java.lang.OutOfMemoryError: Java heap space

The following example shows an OutOfMemoryError occurring as a result of a too small permanent generation space:

2010-11-25T18:26:37.755-0600: [Full GC
[PSYoungGen: 0K->0K(141632K)]
[ParOldGen: 132538K->132538K(350208K)]
32538K->32538K(491840K)
[PSPermGen: 65536K->65536K(65536K)], 0.2430136 secs]
[Times: user=0.37 sys=0.00, real=0.24 secs]
java.lang.OutOfMemoryError: PermGen space

If you observe an OutOfMemoryError in the garbage collection logs, try increasing the Java heap size to 80% to 90% of the physical memory you have available for the JVM.

Keep in mind the limitations of Java heap size based on the hardware platform and whether you are using a 32-bit or 64-bit JVM.

After increasing the Java heap size check the garbage collection logs for OutOfMemoryError. Repeat these steps, increasing the Java heap size at each iteration, until you observe no OutOfMemoryErrors in the garbage collection logs.

Once the application is running in its steady state without experiencing OutOfMemoryErrors, the next step is to calculate the application's live data size.

Calculate Live Data Size

The live data size is the Java heap size consumed by the set of long-lived objects required to run the application in its steady state. In other words, the live data size is the Java heap occupancy of the old generation space and permanent space after a full garbage collection while the application is running in its steady state.

The live data size for a Java application can be collected from the garbage collection logs. The live data size provides the following tuning information:

  • An approximation of the amount of old generation Java heap occupancy consumed while running the application in steady state.
  • An approximation of the amount of permanent generation heap occupancy consumed while running the application in steady state.

To get a good measure of an application’s live data size, it is best to look at the Java heap occupancy after several full garbage collections. Make sure these full garbage collections are occurring while the application is running in its steady state.

If the application is not experiencing full garbage collections, or they are not occurring very frequently, you can induce full garbage collections using the JVM monitoring tools VisualVM or JConsole. To force full garbage collections, monitor the application with VisualVM or JConsole and click the Perform GC button. A command line alternative for forcing a full garbage collection is to use the HotSpot JDK distribution's jmap command.

$ jmap -histo:live <process_id>

The JVM process id can be acquired using the JDK's jps command-line tool.

Initial Heap Space Size Configuration

This section describes how to use live data size calculations to determine an initial Java heap size. It is wise to compute an average of the Java heap occupancy and garbage collection duration of several full garbage collections for your live data size calculation.

The general sizing rules are shown in the table below.

| Space | Command Line Option | Occupancy Factor |
| --- | --- | --- |
| Java heap | -Xms and -Xmx | 3x to 4x old generation space occupancy after full garbage collection |
| Permanent Generation or Metaspace | -XX:PermSize and -XX:MaxPermSize | 1.2x to 1.5x permanent generation space occupancy after full garbage collection |
| Young Generation or New | -Xmn | 1x to 1.5x old generation space occupancy after full garbage collection |
| Old Generation | Implied from overall Java heap size minus the young generation size | 2x to 3x old generation space occupancy after full garbage collection |
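
As an illustration of these rules only (not measured data), suppose the old generation occupancy after full garbage collections averages 512 megabytes and the permanent generation occupancy averages 64 megabytes. The table then suggests roughly -Xms and -Xmx of 1536 to 2048 megabytes (3x to 4x of 512), -Xmn of 512 to 768 megabytes (1x to 1.5x of 512), and -XX:PermSize/-XX:MaxPermSize of about 77 to 96 megabytes (1.2x to 1.5x of 64).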

Additional Considerations

It is important to know that the Java heap size calculated in the previous section does not represent the full memory footprint of a Java application. A better way to determine a Java application's total memory use is by monitoring the application with an operating system tool such as prstat on Oracle Solaris, top on Linux, and Task Manager on Windows. The Java heap size may not be the largest contributor to an application's memory footprint. For example, applications may require additional memory for thread stacks. The larger the number of threads, the more memory is consumed in thread stacks. The deeper the method calls executed by the application, the larger the thread stacks. The memory footprint estimate for an application must include any additional memory use.

The Java heap sizes calculated in this step are a starting point. These sizes may be further modified in the remaining steps of the tuning process, depending on the application’s requirements.

Tune Latency/Responsiveness

This step begins by looking at the latency impact of the garbage collector starting with the initial Java heap size established in the last step “Determine memory footprint”.

The following activities are involved in evaluating the garbage collector’s impact on latency:

  • Measuring minor garbage collection duration.
  • Measuring minor garbage collection frequency.
  • Measuring worst case full garbage collection duration.
  • Measuring worst case full garbage collection frequency.

Measuring garbage collection duration and frequency is crucial to refining the Java heap size configuration. Minor garbage collection duration and frequency measurements drive the refinement of the young generation size. Measuring the worst case full garbage collection duration and frequency drives old generation sizing decisions and the decision of whether to switch from using the throughput garbage collector -XX:+UseParallelOldGC to using the concurrent garbage collector -XX:+UseConcMarkSweepGC.

Inputs

There are several inputs to this step of the tuning process. They are derived from the systemic requirements for the application.

  • The acceptable average pause time target for the application.
  • The frequency of latencies induced by minor garbage collections that is considered acceptable.
  • The maximum pause time incurred by the application that can be tolerated by the application’s stakeholders.
  • The frequency of the maximum pause time that is considered acceptable by the application’s stakeholders.

Tune Application Throughput

This is the final step of the tuning process. The main input into this step is the application's throughput performance requirements. An application's throughput is something measured at the application level, not at the JVM level. Thus, the application must report some kind of throughput metric, or some kind of throughput metric must be derived from the operations it is performing. When the observed application throughput meets or exceeds the throughput requirements, you are finished with the tuning process. If you need additional application throughput to meet the throughput requirements, then you have some additional JVM tuning work to do.

Another important input into this step is the amount of memory that can be made available to the deployed Java application. As the GC maximize Memory Principle says, the more memory that can be made available for the Java heap, the better the performance.

It is possible that the application’s throughput requirements cannot be met. In that case, the application’s throughput requirements must be revisited.

Summary

Some important points are summarized below:

  • You should follow the process of JVM tuning.
  • Your requirements determine what to tune. There are many trade-offs; you must determine which requirements are most important.
  • The main requirements for JVM tuning are memory footprint, latency, and throughput.

References

[1] Java Performance by Charlie Hunt

This post talks about automatic and adaptive memory management in the Java runtime. It mainly includes the following content: concepts of memory management, how a garbage collector works, and memory management in the JRockit VM.

Concepts of Automatic Memory Management

Automatic memory management is defined as any garbage collection technique that automatically gets rid of stale references, making a free operator unnecessary.

There are many implementations of automatic memory management techniques, such as reference counting, heap management strategies, and tracing techniques (traversing live object graphs).

Adaptive memory management

Basing JVM behavior on runtime feedback is a good idea. JRockit was the first JVM to recognize that adaptive optimization based on runtime feedback could be applied to all subsystems in the runtime and not just to code generation. One of these subsystems is memory management.

Adaptive memory management is based heavily on runtime feedback. It is a special case of automatic memory management.

Adaptive memory management must correctly utilize runtime feedback for optimal performance. This means changing GC strategies, automatic heap resizing, getting rid of memory fragmentation at the right intervals, or mainly recognizing when it is most appropriate to “stop the world”. Stopping the world means halting the executing Java program, which is a necessary evil for parts of a garbage collection cycle.

Advantages of automatic memory management

  • It contributes to the speed of the software development cycle, because with automatic memory management, memory allocation bugs and buffer overruns can't occur.
  • An adaptive memory manager may pick the appropriate garbage collection strategy for an application based on its current behavior, appropriately changing the number of garbage collecting threads or fine tuning other aspects of garbage collection strategies whenever needed.

Disadvantages of automatic memory management

  • It can slow down execution for certain applications to such an extent that it becomes impractical.
  • There may still be memory leaks in a garbage collected environment.

Fundamental Heap Management

Before knowing actual algorithms for garbage collection, we need to know the allocation and deallocation of objects. We also need to know which specific objects on the heap to garbage collect, and how they get there and how they are removed.

Allocating and releasing objects

Allocation of individual objects, in the common case, never takes place directly on the heap. Rather, it is performed in thread local buffers or similar constructs that are promoted to the heap from time to time. However, in the end, allocation is still about finding appropriate space on the heap for the newly allocated objects or collections of objects.

In order to put allocated objects on the heap, the memory management system must keep track of which sections of the heap are free. Free heap space is usually managed by maintaining a free list – a linked list of the free memory chunks on the heap, prioritized in some order that makes sense.

A best fit or first fit can be performed on the free list in order to find a heap address where enough free space is available for the object. There are many different allocation algorithms for this, with different advantages.

Fragmentation and compaction

It is not enough to just keep track of free space in a useful manner. Fragmentation is also an issue for the memory manager. When dead objects are garbage collected all over the heap, we end up with a lot of holes from where objects have been removed.

Fragmentation is a serious scalability issue for garbage collection, as we can have a very large amount of free space on the heap that, even though it is free, is actually unusable.

If the memory manager can’t find enough contiguous free space for the new object, an OutOfMemoryError will be thrown, even though there is much free space on the heap.

Thus, a memory management system needs some logic for moving objects around on the heap, in order to create larger contiguous free regions. This process is called compaction, and involves a separate GC stage where the heap is defragmented by moving live objects so that they are next to one another on the heap.

Compaction is difficult to do without stopping the world. By looking at the object reference graph and by gambling that objects referencing each other are likely to be accessed in sequence, the compaction algorithm may move these objects so that they are next to one another on the heap. This is beneficial for the cache, and since the object lifetimes are similar, larger free heap holes are created upon reclamation.

Garbage collection algorithms

All techniques for automatic memory management boil down to keeping track of which objects are being used by the running program, in other words, which objects are referenced by other objects that are also in use. Objects that are no longer in use may be garbage collected. Objects in use are also called live objects.

There are two common categories of garbage collection techniques: reference counting and tracing garbage collection.

Reference counting

Reference counting is a garbage collection algorithm where the runtime keeps track of how many live objects point to a particular object at a given time.

When the reference count for an object drops to zero, the object has no referrers left, so the object is available for garbage collection.
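
As a purely illustrative sketch of the bookkeeping involved (not how any JVM manages objects, as noted below), reference counting in Java-like code could look like this:

import java.util.concurrent.atomic.AtomicInteger;

// Illustrative reference-counting bookkeeping only; real JVMs do not manage Java objects this way.
public class RefCounted {

    private final AtomicInteger refCount = new AtomicInteger(1); // one referrer at creation time

    public void retain() {
        refCount.incrementAndGet();        // a new reference now points at this object
    }

    public void release() {
        if (refCount.decrementAndGet() == 0) {
            free();                        // no referrers left: reclaim immediately
        }
    }

    private void free() {
        System.out.println("reclaimed");   // stand-in for returning the memory to the free list
    }
}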

Advantages of reference counting algorithm

  • Its obvious simplicity: any unreferenced object may be reclaimed immediately when its reference count drops to zero.

Disadvantages of reference counting algorithm

  • The obvious flaw is that cyclic constructs can never be garbage collected, which consequently causes memory leaks.
  • Keeping the reference counts up to date can be expensive, especially in a parallel environment where synchronization is required.

There are no commercial Java implementations today where reference counting is a main garbage collection technique in the JVM, but it might well be used by subsystems and for simple protocols in the application layer.

Tracing techniques

A tracing garbage collector starts by marking all objects currently seen by the running program as live. It then recursively marks all objects reachable from those objects as live as well. There are many variations of tracing techniques, for example, “mark and sweep” and “stop and copy”.

There are some basic concepts in tracing techniques, for example, the root set: the initial input set for this kind of search algorithm, that is, the set of live objects from which the trace will start. The root set contains all objects that are available without having to trace any references.

Mark and sweep

The mark and sweep algorithm is the basis of all the garbage collectors in all commercial JVMs today. Mark and sweep can be done with or without copying or moving objects. However, the real challenge is turning it into an efficient and highly scalable algorithm for memory management. The following pseudocode describes a naive mark and sweep algorithm:

Mark:
    Add each object in the root set to a queue
    For each object X in the queue
        Mark X reachable
        Add all objects referenced from X to the queue
Sweep:
    For each object X on the heap
        If X is not marked, garbage collect it

The following figures show the process of mark and sweep:

  1. Before mark.
  2. After mark.
  3. After sweep.

In naive mark and sweep implementations, a mark bit is typically associated with each reachable object. The mark bit keeps track of if the object has been marked or not.

A variant of mark and sweep that parallelizes better is tri-coloring mark and sweep. Basically, instead of using just one binary mark bit per object, a color, or ternary value, is used. There are three colors: white, grey, and black.

  • White objects are considered dead and should be garbage collected.
  • Black objects are guaranteed to have no references to white objects.
  • Grey objects are live, but with the status of their children unknown.

Initially, there are no black objects – the marking algorithm needs to find them, and the root set is colored grey to make the algorithm explore the entire reachable object graph. All other objects start out as white.

The tri-color algorithm is fairly simple:

Mark:
    All objects are White by default.
    Color all objects in the root set Grey.
    While there exist Grey objects
        For each Grey object X
            For all White objects (successors) Y that X references
                Color Y Grey
            If all edges from X lead to Grey or Black objects
                Color X Black
Sweep:
    Garbage collect all White objects

The main idea here is that as long as the invariant that no black nodes ever point to white nodes is maintained, the garbage collector can continue marking even while changes to the object graph take place.
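Here is a corresponding Java sketch of the tri-color mark phase over the same kind of toy object graph; the Color, TriColorNode, and TriColorMarker names are hypothetical, and the sketch ignores the concurrent mutation that tri-coloring is designed to tolerate.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.List;

enum Color { WHITE, GREY, BLACK }

// A toy object with a color instead of a single mark bit.
class TriColorNode {
    Color color = Color.WHITE;
    final List<TriColorNode> references = new ArrayList<>();
}

class TriColorMarker {
    // After marking, every node still colored WHITE can be swept.
    static void mark(Collection<TriColorNode> rootSet) {
        Deque<TriColorNode> greyQueue = new ArrayDeque<>();
        for (TriColorNode root : rootSet) {
            root.color = Color.GREY;
            greyQueue.add(root);
        }
        while (!greyQueue.isEmpty()) {
            TriColorNode x = greyQueue.poll();
            for (TriColorNode y : x.references) {
                if (y.color == Color.WHITE) { // shade white successors grey
                    y.color = Color.GREY;
                    greyQueue.add(y);
                }
            }
            x.color = Color.BLACK; // all of X's successors are now grey or black
        }
    }
}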

Stop and Copy

Stop and copy can be seen as a special case of tracing GC, and is intelligent in its way, but is impractical for large heap sizes in real applications.

Stop and copy garbage collection partitions the heap into two regions of equal size. Only one region is in use at a time. Stop and copy garbage collection goes through all live objects in one of the heap regions, starting at the root set, following the root set pointers to other objects and so on. The marked live objects are moved to the other heap region. After garbage collection, the heap regions are switched so that the other half of the heap becomes the active region before the next collection cycle.
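The following Java sketch illustrates the idea, reusing the ObjectNode class from the mark and sweep sketch above. Moving an object between two lists stands in for copying it to the other heap half; the CopyingCollector name is hypothetical, and real collectors also have to forward pointers to the moved objects, which this sketch omits.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class CopyingCollector {
    // Two heap halves of equal size; only fromSpace is active between collections.
    private List<ObjectNode> fromSpace = new ArrayList<>();
    private List<ObjectNode> toSpace = new ArrayList<>();

    void collect(Collection<ObjectNode> rootSet) {
        // Copy every object reachable from the root set into the other half.
        Deque<ObjectNode> queue = new ArrayDeque<>(rootSet);
        Set<ObjectNode> copied = new HashSet<>();
        while (!queue.isEmpty()) {
            ObjectNode x = queue.poll();
            if (!copied.add(x)) {
                continue; // already copied
            }
            toSpace.add(x);
            queue.addAll(x.references);
        }
        // Everything left in the old half is garbage; swap the halves.
        fromSpace.clear();
        List<ObjectNode> tmp = fromSpace;
        fromSpace = toSpace;
        toSpace = tmp;
    }
}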

Advantages of stop and copy algorithm:

  • Fragmentation can’t become an issue.

Disadvantages of stop and copy algorithm:

  • All live objects must be copied each time a garbage collection is performed, introducing a serious overhead in GC time as a function of the amount of live data.
  • Only using half of the heap at a time is a waste of memory.

The following figure shows the process of stop and copy:

Generational garbage collection

In object-oriented languages, most objects are temporary or short-lived. Consequently, a performance improvement for handling short-lived objects on the heap can be had if the heap is split into two or more parts called generations.

In generational garbage collection, new objects are allocated in “young” generations of the heap, which typically are orders of magnitude smaller than the “old” generation, the main part of the heap. Garbage collection is then split into young and old collections, with a young collection merely sweeping the young spaces of the heap, removing dead objects and promoting surviving objects by moving them to an older generation.

Collecting a smaller young space is orders of magnitude faster than collecting the larger old space. Even though young collections need to happen far more frequently, this is more efficient because many objects die young and never need to be promoted. Ideally, total throughput is increased and some potential fragmentation is removed.

JRockit refers to the young generations as nurseries.

Multi-generation nurseries

While generational GCs typically default to using just one nursery, sometimes it can be a good idea to keep several small nursery partitions in the heap and gradually age young objects, moving them from the “younger” nurseries to the “older” ones before finally promoting them to the “old” part of heap. This stands in contrast with the normal case that usually involves just one nursery and one old space.

Multi generation nurseries may be more useful in situations where heavy object allocation takes place.

If young generation objects live just a bit longer, typically if they survive a first nursery collection, the standard behavior of a single generation nursery collector would cause these objects to be promoted to the old space. There, they will contribute more to fragmentation when they are garbage collected. So it might make sense to have several young generations on the heap, with different age spans for young objects in different nurseries, to try to keep the heap holes away from the old space where they do the most damage.

Of course the benefits of a multi-generational nursery must be balanced against the overhead of copying objects multiple times.

Write barriers

In generational GC, objects may reference other objects located in different generations of the heap. For example, objects in the old space may point to objects in the young spaces and vice versa. If we had to handle updates to all references from the old space to the young space on GC by traversing the entire old space, no performance would be gained from the generational approach. As the whole point of generational garbage collection is only to have to go over a small heap segment, further assistance from the code generator is required.

In generational GC, most JVMs use a mechanism called write barriers to keep track of which parts of the heap need to be traversed. Every time an object A starts to reference another object B, by means of B being placed in one of A’s fields or arrays, write barriers are needed.

The traditional approach to implementing write barriers is to divide the heap into a number of small consecutive sections (typically about 512 bytes each) that are called cards. The address space of the heap is thus mapped to a more coarse-grained card table. Whenever the Java program writes to a field in an object, the card on the heap where the object resides is “dirtied” by having the write barrier code set a dirty bit.

Using the write barriers, the traversal time problem for references from the old generation to the nursery is shortened. When doing a nursery collection, the GC only has to check the portions of the old space represented by dirty cards.
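Below is a rough Java sketch of the card-table idea, assuming 512-byte cards and treating object addresses as plain numbers; the CardTable, writeBarrier, and isDirty names are hypothetical illustration names, since in a real JVM the barrier is a couple of machine instructions emitted by the code generator.

// A toy card table over a heap modeled as a flat range of addresses.
class CardTable {
    static final int CARD_SIZE = 512; // bytes covered by one card

    private final byte[] cards;

    CardTable(long heapSizeInBytes) {
        cards = new byte[(int) (heapSizeInBytes / CARD_SIZE) + 1];
    }

    // Conceptually emitted after every reference store such as "a.field = b".
    void writeBarrier(long addressOfModifiedObject) {
        cards[(int) (addressOfModifiedObject / CARD_SIZE)] = 1; // dirty the card
    }

    // A nursery collection only scans old-space regions whose cards are dirty.
    boolean isDirty(int cardIndex) {
        return cards[cardIndex] == 1;
    }
}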

Throughput versus low latency

Garbage collection requires stopping the world, halting all program execution, at some stage. Performing GC and executing Java code concurrently requires a lot more bookkeeping and thus, the total time spent in GC will be longer. If we only care about throughput, stopping the world isn’t an issue: just halt everything and use all CPUs to garbage collect. However, for most applications, latency is the main problem, and latency is caused by not spending every available cycle executing Java code.

Thus, the tradeoff in memory management is between maximizing throughput and maintaining low latencies; we can’t expect to have both.

Garbage collection in JRockit

The backbone of the GC algorithm used in JRockit is based on the tri-coloring mark and sweep algorithm. For nursery collections, heavily modified variants of stop and copy are used.

Garbage collection in JRockit can work with or without generations, depending on what we are optimizing for.

Summary

  • First, we need to know what automatic memory management is, along with its advantages and disadvantages.
  • The fundamentals of heap management are how to allocate and release objects on the heap and how to compact a fragmented heap.
  • There are two common garbage collection algorithms: reference counting and tracing garbage collection. Reference counting is simple but has some serious drawbacks. There are many variations of tracing techniques, such as “mark and sweep” and “stop and copy”. The “stop and copy” technique has some drawbacks and is only used in special scenarios. Therefore, mark and sweep is the most popular garbage collection algorithm used in JVMs.
  • Real-world garbage collector implementations are typically generational garbage collectors based on the “mark and sweep” algorithm.

References

[1] Oracle JRockit: The Definitive Guide by Marcus Hirt, Marcus Lagergren

The Java Virtual Machine defines various run-time data areas that are used during execution of a program. Some of these data areas are created on Java Virtual Machine start-up and are destroyed only when the Java Virtual Machine exits. Other data areas are per thread. Per-thread data areas are created when a thread is created and destroyed when the thread exits. The data areas of the JVM are logical memory spaces, and they may not be contiguous physical memory. The following image shows the JVM run-time data areas:

The pc Register

The Java Virtual Machine can support many threads of execution at once. Each JVM thread has its own pc (program counter) register. At any point, each JVM thread is executing the code of a single method, namely the current method for that thread. If that method is not native, the pc register contains the address of the JVM instruction currently being executed. If the method is native, the value of the pc register is undefined.

Java Virtual Machine Stacks

Each JVM thread has a private JVM stack, created at the same time as the thread. A JVM stack stores frames. The JVM stack holds local variables and partial results, and plays a part in method invocation and return.

Each frame in a stack stores the current method’s local variable array, operand stack, and constant pool reference. A JVM stack may have many frames, because until a method of a thread finishes it may call many other methods, and the frames of those methods are also stored in the same JVM stack.

Because the JVM stack is never manipulated directly except to push and pop frames, frames may be heap allocated. The memory for a JVM stack does not need to be contiguous.

Frames

A frame is used to store data and partial results, as well as to perform dynamic linking, return values for methods, and dispatch exceptions.

A new frame is created each time a method is invoked. A frame is destroyed when its method invocation completes, whether that completion is normal or abrupt. Frames are allocated from the JVM stack of the thread creating the frame. Each frame has its own array of local variables, operand stack, and a reference to the run-time constant pool of the class of the current method.

Only one frame, the frame for the executing method, is active at any point in a given thread of control. This frame is referred to as the current frame, and its method is known as the current method. The class in which the current method is defined is the current class. Operations on local variables and the operand stack are typically with reference to the current frame.

A frame ceases to be current if its method invokes another method or if its method completes. When a method is invoked, a new frame is created and becomes current when control transfers to the new method. On method return, the current frame passes back the result of its method invocation, to the previous frame.

A frame created by a thread is local to that thread and cannot be referenced by any other thread.
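A small example makes the per-thread nature of JVM stacks visible: deep recursion pushes one frame per call until that thread’s stack overflows, while other threads are unaffected. This is only a demonstration sketch; the depth at which the overflow happens depends on the stack size (for example, the -Xss option) and is not deterministic.

public class StackDemo {
    private static int depth = 0;

    private static void recurse() {
        depth++;
        recurse(); // each call pushes a new frame onto the current thread's JVM stack
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            try {
                recurse();
            } catch (StackOverflowError e) {
                System.out.println("worker stack overflowed after about " + depth + " frames");
            }
        });
        worker.start();
        worker.join();
        // The main thread's own stack is untouched by the worker's overflow.
        System.out.println("main thread is still running normally");
    }
}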

Heap

The JVM has a heap that is shared among all JVM threads. The heap is the run-time data area from which memory for all class instances and arrays is allocated.

The heap is created on virtual machine start-up. Heap storage for objects is reclaimed by an automatic storage management system (known as a garbage collector); objects are never explicitly deallocated. The JVM assumes no particular type of automatic storage management system, and the storage management technique may be chosen according to the implementor’s system requirements. The memory for the heap does not need to be contiguous.

Method Area

The JVM has a method area that is shared among all JVM threads. The method area is analogous to the storage area for compiled code of a conventional language, or to the “text” segment in an operating system process. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors, including the special methods used in class and instance initialization and interface initialization.

The Method area is created on virtual machine start-up. Although the method area is logically part of the heap, simple implementations may choose not to either garbage collect or compact it. The method area may be of a fixed size or may be expanded as required. The memory for the method area does not need to be contiguous.

Run-Time Constant Pool

A run-time constant pool is a per-class or per-interface run-time representation of the constant_pool table in a class file. It contains several kinds of constants, ranging from numeric literals known at compile time to method and field references that must be resolved at run time. The run-time constant pool serves a function similar to that of a symbol table for a conventional programming language, although it contains a wider range of data than a typical symbol table.

Each run-time constant pool is allocated from the JVM’s method area. The run-time constant pool for a class or interface is constructed when the class or interface is created by the JVM.

Reference constants

  • A symbolic reference to a class or interface is derived from a CONSTANT_Class_info structure.
  • A symbolic reference to a field of a class is derived from a CONSTANT_Fieldref_info structure.
  • A symbolic reference to a method of a class is derived from a CONSTANT_Methodref_info structure.
  • A symbolic reference to a method of an interface is derived from a CONSTANT_InterfaceMethodref_info structure.

Literal constants

  • A string literal is derived from a CONSTANT_String_info structure (see the sketch after this list).
  • Constant values are derived from CONSTANT_Integer_info, CONSTANT_Float_info, CONSTANT_Long_info, or CONSTANT_Double_info structures.
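A small example of what pooled string literals mean at run time: two identical literals in the same class resolve to the same constant-pool entry and therefore to the same object, while new String() always creates a fresh heap object. The class name ConstantPoolDemo is only for illustration.

public class ConstantPoolDemo {
    public static void main(String[] args) {
        String a = "hello";                  // resolved from the run-time constant pool
        String b = "hello";                  // same pool entry, same object
        String c = new String("hello");      // a new object on the heap

        System.out.println(a == b);          // true
        System.out.println(a == c);          // false
        System.out.println(a == c.intern()); // true: intern() returns the pooled instance
    }
}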

Native Method Stacks

A JVM may use native method stacks to support native methods, which are methods written in a language other than the Java programming language. JVM implementations that cannot load native methods and that do not themselves rely on conventional stacks need not supply native method stacks. If supplied, native method stacks are typically allocated per thread when each thread is created.

How a Program’s Execution Is Represented in the JVM Memory Areas

  • The program starts with the main method.
  • Before executing main(): 1) the JVM creates a thread and allocates the pc register, JVM stack, and native method stack run-time memory areas for the main thread; 2) the JVM loads the class containing main() and initializes the class’s static resources (which are stored in the run-time constant pool); 3) it creates a new frame for the main() method and constructs the frame’s structure; 4) it updates the pc register to refer to the first instruction of main().
  • The JVM executes the instruction referred to by the main thread’s pc register.
  • When we create an instance of a class in main(): 1) the JVM loads that class and initializes its static resources; 2) the JVM creates a new object in the heap, initializes its members, and runs the constructor.
  • When we call a method of the newly created object in main(): 1) the JVM creates a new frame in the JVM stack for the called method, and that frame becomes the current frame; 2) the pc register is updated to refer to that method; 3) the JVM executes the instruction referred to by the pc register. (A minimal program mapping onto these steps follows this list.)
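The following minimal program is only meant to be mapped onto the steps above; the Greeter class and greet() method are hypothetical names.

public class Main {
    public static void main(String[] args) { // a frame for main() is created on the main thread's JVM stack
        Greeter greeter = new Greeter();     // Greeter is loaded, then an instance is allocated on the heap
        greeter.greet();                     // a new frame for greet() becomes the current frame
    }                                        // main()'s frame is destroyed when it returns
}

class Greeter {
    void greet() {
        String message = "hello";            // a local variable in greet()'s frame
        System.out.println(message);
    }                                        // greet()'s frame is destroyed, main()'s frame becomes current again
}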

Conclusion

Per-Thread Data Areas

  • The pc Register. It contains the address of the JVM instruction currently being executed.
  • Java Virtual Machine Stacks. It holds local variables and partial results.
  • Native Method Stacks. To support native methods.

Thread-Shared Data Areas

  • Heap. It contains all class instances and arrays.
  • Method Area. It’s logically part of the heap. It stores per-class structures such as the run-time constant pool, field and method data, and the code for methods and constructors.
  • Run-Time Constant Pool. It’s part of the method area. It contains several kinds of constants: literals and symbolic references.

References

[1] The Java Virtual Machine Specification

[2] JVM Stacks and Stack Frames

Between July and August 2019 I spent a bit more than two weeks writing a fairly simple crawler project. Its main function is to periodically crawl hot-topic information from a number of websites, store the data in a Redis cache, and serve a page that can switch between the different websites and display the information.

Why did I write this project? I had seen many people on V2EX sharing hot-topic aggregation websites they had built themselves. Since I had just left my job and had not written code for a while, I decided to write a similar project for practice. That project became hot-crawler, and its source code is on GitHub.

After finishing it, I bought a domain name and a cloud server and deployed the project. However, I found that the website would occasionally become unreachable and unresponsive. It turned out that the application process on the server had died. After restarting the application, I watched the server’s resource usage in its control panel and saw the memory usage climbing steadily, so I guessed the process was dying because it ran out of memory, although I had no idea what was causing the memory leak. Since I had already promoted the website online, the problem was urgent, so I looked for a stopgap. First, I registered the application as a Linux system service that starts automatically, so that it would restart itself after crashing instead of staying down. Then I added a second server and used Nginx for load balancing. It takes a while for the process to die and a dozen or so seconds to restart, and the chance of both servers being down during the same dozen seconds is small, so load balancing staggers those windows and keeps the website continuously reachable.

The outage problem was worked around, but not truly solved. I wanted to analyze why the program was leaking memory, but I did not know where to start. It would probably require learning a lot about the JVM, and at the time my study plan was to learn computer science fundamentals first (architecture, operating systems, algorithms, networking, database systems, and so on) and only then dig deeper into Java and Java web development. So I had to set the problem aside and keep studying the fundamentals. Yet the problem kept nagging at me, and it is what later made me want so strongly to learn the JVM.

Recently I have been studying the JVM. I first looked at how the JVM is implemented internally, mainly adaptive code generation, adaptive memory management, and threads and synchronization. Then I looked at how to tune the JVM: mainly how to choose a JIT compiler and a garbage collector, how to read GC logs, how to use visualization tools to analyze a program’s runtime state, and how to adjust JVM parameters.

After studying the JVM for a while, I was ready to analyze my old hot-crawler project. What follows is my analysis process.

  1. First, I did not set any JVM memory sizes and simply printed the GC log to see whether anything looked wrong. Below are the corresponding JVM options and GC log.
-XX:+PrintGCDetails -Xloggc:hotcrawler_jvm.log
Java HotSpot(TM) 64-Bit Server VM (25.171-b11) for windows-amd64 JRE (1.8.0_171-b11), built on Mar 28 2018 16:06:12 by "java_re" with MS VC++ 10.0 (VS2010)
Memory: 4k page, physical 16655468k(11303820k free), swap 19145836k(12290304k free)
CommandLine flags: -XX:InitialHeapSize=266487488 -XX:MaxHeapSize=4263799808 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:-UseLargePagesIndividualAllocation -XX:+UseParallelGC
0.809: [GC (Allocation Failure) [PSYoungGen: 65536K->10724K(76288K)] 65536K->10905K(251392K), 0.0073689 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
1.603: [GC (Allocation Failure) [PSYoungGen: 76260K->10737K(76288K)] 76441K->12235K(251392K), 0.0107562 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2.292: [GC (Allocation Failure) [PSYoungGen: 76273K->10728K(76288K)] 77771K->18061K(251392K), 0.0105378 secs] [Times: user=0.03 sys=0.00, real=0.01 secs]
2.669: [GC (Allocation Failure) [PSYoungGen: 76264K->10728K(141824K)] 83597K->23640K(316928K), 0.0166216 secs] [Times: user=0.13 sys=0.00, real=0.02 secs]
3.431: [GC (Metadata GC Threshold) [PSYoungGen: 94298K->10728K(141824K)] 107211K->30083K(316928K), 0.0206330 secs] [Times: user=0.13 sys=0.00, real=0.02 secs]
3.452: [Full GC (Metadata GC Threshold) [PSYoungGen: 10728K->0K(141824K)] [ParOldGen: 19355K->26803K(141312K)] 30083K->26803K(283136K), [Metaspace: 20761K->20761K(1069056K)], 0.1067090 secs] [Times: user=0.53 sys=0.02, real=0.11 secs]
4.969: [GC (Allocation Failure) [PSYoungGen: 131072K->19710K(199168K)] 157875K->46522K(340480K), 0.0136119 secs] [Times: user=0.13 sys=0.00, real=0.01 secs]
6.407: [GC (Allocation Failure) [PSYoungGen: 198910K->21995K(265728K)] 225722K->54167K(407040K), 0.0228798 secs] [Times: user=0.16 sys=0.00, real=0.02 secs]

From the GC log above, you can see that the HotSpot VM set the maximum heap size to over 4 GB, about a quarter of my machine’s physical memory, then performed several minor GCs, followed by one full GC triggered by the metadata GC threshold, which enlarged the metaspace (the area that replaced the permanent generation), and then a few more minor GCs. That is the entire GC log; nothing was logged after this.

Analyzing the GC log printed with the default settings did not reveal much, so I gained little from this step.

  2. I decided to set the size of each JVM memory area (heap, young generation, and metaspace) and then observe how the GC log changed. Below are the corresponding JVM options and GC log.
-XX:+PrintGCDetails -Xloggc:hotcrawler_jvm.log -Xms188m -Xmx188m -XX:NewSize=70m -XX:MaxNewSize=70m -XX:MetaspaceSize=48m -XX:MaxMetaspaceSize=48m
Java HotSpot(TM) 64-Bit Server VM (25.171-b11) for windows-amd64 JRE (1.8.0_171-b11), built on Mar 28 2018 16:06:12 by "java_re" with MS VC++ 10.0 (VS2010)
Memory: 4k page, physical 16655468k(8988472k free), swap 19145836k(9640588k free)
CommandLine flags: -XX:CompressedClassSpaceSize=41943040 -XX:InitialHeapSize=197132288 -XX:MaxHeapSize=197132288 -XX:MaxMetaspaceSize=50331648 -XX:MaxNewSize=73400320 -XX:MetaspaceSize=50331648 -XX:NewSize=73400320 -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseCompressedClassPointers -XX:+UseCompressedOops -XX:-UseLargePagesIndividualAllocation -XX:+UseParallelGC
0.662: [GC (Allocation Failure) [PSYoungGen: 54272K->8698K(62976K)] 54272K->8969K(183808K), 0.0063849 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
1.212: [GC (Allocation Failure) [PSYoungGen: 62970K->8693K(62976K)] 63241K->11555K(183808K), 0.0120044 secs] [Times: user=0.08 sys=0.03, real=0.01 secs]
1.519: [GC (Allocation Failure) [PSYoungGen: 62965K->8680K(62976K)] 65827K->16997K(183808K), 0.0102365 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
1.822: [GC (Allocation Failure) [PSYoungGen: 62952K->8680K(62976K)] 71269K->21376K(183808K), 0.0082618 secs] [Times: user=0.13 sys=0.00, real=0.01 secs]
2.123: [GC (Allocation Failure) [PSYoungGen: 62952K->8696K(62976K)] 75648K->26253K(183808K), 0.0090796 secs] [Times: user=0.13 sys=0.00, real=0.01 secs]
2.519: [GC (Allocation Failure) [PSYoungGen: 62968K->8696K(48640K)] 80525K->30111K(169472K), 0.0105225 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
2.890: [GC (Allocation Failure) [PSYoungGen: 48632K->13143K(55808K)] 70047K->37449K(176640K), 0.0127155 secs] [Times: user=0.11 sys=0.02, real=0.01 secs]
3.187: [GC (Allocation Failure) [PSYoungGen: 53079K->15846K(48640K)] 77385K->45876K(169472K), 0.0092983 secs] [Times: user=0.06 sys=0.00, real=0.01 secs]
3.740: [GC (Allocation Failure) [PSYoungGen: 48614K->14735K(52224K)] 78644K->47682K(173056K), 0.0079699 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
3.907: [GC (Allocation Failure) [PSYoungGen: 47503K->2562K(50176K)] 80450K->48166K(171008K), 0.0091892 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]
4.044: [GC (Allocation Failure) [PSYoungGen: 33282K->1646K(32768K)] 78886K->49090K(153600K), 0.0036603 secs] [Times: user=0.11 sys=0.00, real=0.00 secs]
4.195: [GC (Allocation Failure) [PSYoungGen: 32366K->1328K(48128K)] 79810K->50120K(168960K), 0.0050500 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
4.376: [GC (Allocation Failure) [PSYoungGen: 27952K->3178K(49152K)] 76744K->53074K(169984K), 0.0036726 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
4.543: [GC (Allocation Failure) [PSYoungGen: 29802K->2879K(49152K)] 79698K->55308K(169984K), 0.0047447 secs] [Times: user=0.13 sys=0.00, real=0.01 secs]
4.857: [GC (Allocation Failure) [PSYoungGen: 29503K->4040K(49152K)] 81932K->58848K(169984K), 0.0043435 secs] [Times: user=0.13 sys=0.00, real=0.00 secs]
5.262: [GC (Allocation Failure) [PSYoungGen: 30664K->1376K(49664K)] 85472K->60099K(170496K), 0.0039789 secs] [Times: user=0.09 sys=0.00, real=0.00 secs]
5.549: [GC (Allocation Failure) [PSYoungGen: 29024K->2343K(49664K)] 87747K->62299K(170496K), 0.0041532 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
6.053: [GC (Allocation Failure) [PSYoungGen: 29991K->4762K(51200K)] 89947K->66211K(172032K), 0.0036624 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
6.323: [GC (Allocation Failure) [PSYoungGen: 34458K->1177K(50176K)] 95907K->66148K(171008K), 0.0032255 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
6.495: [GC (Allocation Failure) [PSYoungGen: 30873K->2340K(52224K)] 95844K->67544K(173056K), 0.0025049 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]
6.826: [GC (Allocation Failure) [PSYoungGen: 34596K->5084K(51712K)] 99800K->71949K(172544K), 0.0051142 secs] [Times: user=0.11 sys=0.02, real=0.01 secs]
7.002: [GC (Allocation Failure) [PSYoungGen: 37340K->2259K(53760K)] 104205K->73418K(174592K), 0.0056459 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]

From the GC log above, you can see that the sizes the HotSpot VM actually used for each area differ slightly from what we configured. The JVM then performed many minor GCs, repeatedly collecting the young generation. Perhaps the young generation was set a little small, but it does not really seem too small, because the minor GCs only occurred right after the application started and there were none afterwards.

The key point, however, is that there are no more full GCs, which suggests that the old generation and the metaspace have enough memory. Yet the Windows task manager showed the process occupying far more memory than the JVM heap, and its memory usage kept climbing. According to the GC log the heap looked fine, so where was the problem?

  3. Since the GC log did not reveal the problem, I decided to use JConsole to look at the program’s runtime state. After observing for a while, I seemed to have found the root cause: there were far too many threads, and the number kept increasing. The figure below shows the thread-count curve.

So I thought back over where the code creates threads or thread pools. As far as I could recall, only one place creates a thread pool: the crawler’s scheduled task creates a pool to crawl pages. That line of code is shown below:

ExecutorService executorService = Executors.newFixedThreadPool(threadPoolNum);

There is nothing wrong with this line by itself, but I had written it inside the method that the scheduled task executes, so every run of the task created a new thread pool and a batch of new threads. I immediately turned the thread pool into a member field, then watched the JVM’s memory and threads again and found that everything was stable and normal: neither the memory usage nor the thread count kept growing. The figure below shows the thread-count curve after the fix.
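In code, the shape of the bug and of the fix looks roughly like the sketch below. The CrawlerTask class, the crawl methods, and the fixed pool size are illustrative only and are not copied from the hot-crawler source.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CrawlerTask {
    private final int threadPoolNum = 4; // illustrative value

    // Before the fix: a new pool (and its threads) was created on every scheduled
    // run and never shut down, so the thread count kept growing.
    public void crawlLeaky() {
        ExecutorService executorService = Executors.newFixedThreadPool(threadPoolNum);
        executorService.submit(() -> {
            // fetch and cache hot topics ...
        });
    }

    // After the fix: a single pool, created once as a member field and shared by all runs.
    private final ExecutorService sharedPool = Executors.newFixedThreadPool(threadPoolNum);

    public void crawl() {
        sharedPool.submit(() -> {
            // fetch and cache hot topics ...
        });
    }
}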

Through my own effort, I finally solved that long-standing problem. The moment I redeployed the fixed version, I felt genuinely excited.

In the past, when writing code I mainly cared about getting the features to work. This lesson taught me how important a piece of software’s performance and stability are.

What is Proxy

The Proxy is a design pattern in software design. It provides a surrogate or placeholder for another object to control access to it. It lets you call a method on a proxy object the same way you would call a method on a POJO, but the proxy’s methods can do additional things beyond what the POJO’s methods do.

Why Do We Need Proxy

  • It lets you control access to an object. For example, to delay an object’s construction and initialization.
  • It lets you track access to the object. For example, tracking method calls to analyze user behavior or system performance.

How to Use Proxy in Java

There are commonly three ways to implement the proxy pattern in Java: static proxy, JDK dynamic proxy, and CGLib dynamic proxy.

Static Proxy

A static proxy is implemented by keeping a reference to the real subject inside a hand-written proxy class. The following example is a basic static proxy implementation:

public interface Subject {
    void request();
}

public class RealSubject implements Subject {
    public void request() {
        System.out.println("request to RealSubject...");
    }
}

public interface Proxy extends Subject {}

public class ConcreteProxy implements Proxy {
    private RealSubject realSubject = new RealSubject();

    public void request() {
        System.out.println("before...");
        realSubject.request();
        System.out.println("after...");
    }
}

public class Client {
    public static void main(String[] args) {
        Subject subject = new ConcreteProxy();
        subject.request();
    }
}

JDK Dynamic Proxy

The JDK dynamic proxy is implemented in the java.lang.reflect package of the Java API. Its underlying implementation is Java reflection.

The main types for implementing a dynamic proxy are the interface java.lang.reflect.InvocationHandler and the class java.lang.reflect.Proxy.

You can call the Proxy class’s static method Object newProxyInstance(ClassLoader loader, Class<?>[] interfaces, InvocationHandler h) to get a proxy object.

  • The ClassLoader instance is commonly ClassLoader.getSystemClassLoader().
  • The Class<?>[] array lists the interfaces you want the proxy to implement.
  • The InvocationHandler instance is an invocation handler that handles proxy method calls with the method invoke(Object proxy, Method method, Object[] args). We create one by implementing the InvocationHandler interface.

The following code example is a basic JDK Dynamic proxy implementation:

public interface Subject {
    void request();
}

public class RealSubject implements Subject {
    public void request() {
        System.out.println("request to RealSubject...");
    }
}

// MyInvocationHandler.java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;

public class MyInvocationHandler implements InvocationHandler {
    private RealSubject realSubject = new RealSubject();

    @Override
    public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
        System.out.println("before...");
        Object result = method.invoke(realSubject, args);
        System.out.println("after...");
        return result;
    }
}

// Client.java
import java.lang.reflect.Proxy;

public class Client {
    public static void main(String[] args) {
        Subject subject = (Subject) Proxy.newProxyInstance(
                ClassLoader.getSystemClassLoader(),
                new Class[]{Subject.class}, new MyInvocationHandler());
        subject.request();
    }
}

Properties of Proxy Classes

  • All proxy classes extend the class java.lang.reflect.Proxy.
  • A proxy class has only one instance field: the invocation handler, which is defined in the Proxy superclass.
  • Any additional data required to carry out the proxy objects’ tasks must be stored in the invocation handler. The invocation handler wraps the actual object.
  • The names of proxy classes are not specified. The Proxy class in the Java virtual machine generates class names that begin with the string $Proxy.
  • There is only one proxy class for a particular class loader and ordered set of interfaces. If you call the newProxyInstance method twice with the same class loader and interface array, you get two objects of the same class. You can obtain that class with Proxy.getProxyClass(), and you can test whether a particular Class object represents a proxy class by calling the isProxyClass method of the Proxy class. (The sketch after this list checks these properties.)
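A quick way to check these properties is the sketch below, which reuses the Subject and MyInvocationHandler classes from the example above; the printed proxy class name will vary between JVM versions.

import java.lang.reflect.Proxy;

public class ProxyPropertiesDemo {
    public static void main(String[] args) {
        Subject subject = (Subject) Proxy.newProxyInstance(
                ClassLoader.getSystemClassLoader(),
                new Class[]{Subject.class}, new MyInvocationHandler());

        // The generated class name contains $Proxy, e.g. com.sun.proxy.$Proxy0 on Java 8.
        System.out.println(subject.getClass().getName());
        // true: the class was generated by java.lang.reflect.Proxy.
        System.out.println(Proxy.isProxyClass(subject.getClass()));
        // The single instance field of the proxy is its invocation handler.
        System.out.println(Proxy.getInvocationHandler(subject).getClass().getName());
    }
}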

CGLib Dynamic Proxy

CGLib (Code Generation Library) is an open source library capable of creating and loading class files in memory at Java runtime. To do that it uses the Java bytecode generation library ASM, which is a very low-level bytecode creation tool. CGLib can proxy objects that do not implement any interface. To use it, add the following Maven dependency:

<dependency>
    <groupId>cglib</groupId>
    <artifactId>cglib</artifactId>
    <version>3.3.0</version>
</dependency>

How to Use CGLib

Creating a proxy object using CGLib is almost as simple as using the JDK reflection proxy API. The following code example is a basic CGLib dynamic proxy implementation:

public interface Subject {
    void request();
}

public class RealSubject implements Subject {
    public void request() {
        System.out.println("request to RealSubject...");
    }
}

// MyMethodInterceptor.java
import java.lang.reflect.Method;

import net.sf.cglib.proxy.MethodInterceptor;
import net.sf.cglib.proxy.MethodProxy;

public class MyMethodInterceptor implements MethodInterceptor {
    private final Subject realSubject;

    public MyMethodInterceptor(Subject subject) {
        this.realSubject = subject;
    }

    @Override
    public Object intercept(Object o, Method method, Object[] args, MethodProxy methodProxy) throws Throwable {
        System.out.println("before...");
        Object result = method.invoke(realSubject, args);
        System.out.println("after...");
        return result;
    }
}

// Client.java
import net.sf.cglib.proxy.Enhancer;
import net.sf.cglib.proxy.MethodInterceptor;

public class Client {
    public static void main(String[] args) {
        Subject subject = new RealSubject();
        MethodInterceptor methodInterceptor = new MyMethodInterceptor(subject);
        Subject proxy = (Subject) Enhancer.create(Subject.class, methodInterceptor);
        proxy.request();
    }
}

One difference between the JDK dynamic proxy and CGLib is that the generated class names are a bit different, and the CGLib proxy does not require an interface.

It is also important to note that the CGLib proxy class extends the original class, so when the proxy object is created it invokes the constructor of the original class.

Conclusion

Differences between JDK proxy and CGLib:

  • JDK Dynamic proxy can only proxy by interface (so your target class needs to implement an interface, which is then also implemented by the proxy class). CGLib (and javassist) can create a proxy by subclassing. In this scenario the proxy becomes a subclass of the target class. No need for interfaces.
  • The JDK dynamic proxy is generated dynamically at runtime using the JDK Reflection API. CGLib is built on top of ASM and is mainly used to generate a proxy that extends the target class and adds behavior in the proxy methods.

References

[1] Core Java, Volume I: Fundamentals by Cay S. Horstmann

[2] Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides

[3] Understanding “proxy” arguments of the invoke method of java.lang.reflect.InvocationHandler - Stack Overflow

[4] cglib - GitHub

[5] Introduction to cglib

[6] Creating a Proxy Object Using cglib

[7] What is the difference between JDK dynamic proxy and CGLib? - Stack Overflow

The EXPLAIN command is the main way to find out how the query optimizer decides to execute queries. This feature has limitations and doesn’t always tell the truth, but its output is the best information available. Interpreting EXPLAIN will also help you learn how MySQL’s optimizer works.

Invoking EXPLAIN

To use EXPLAIN, simply add the word EXPLAIN just before the SELECT keyword in your query. MySQL will set a flag on the query. When it executes the query, the flag causes it to return information about each step in the execution plan instead of executing it. It returns one or more rows, which show each part of the execution plan and the order of execution. Here is a simple example of an EXPLAIN statement:

EXPLAIN SELECT 1;

There is one row in the output per table in the query. If the query joins two tables, there will be two rows of output. The meaning of “table” here is fairly broad: it can mean a subquery, a UNION result, and so on.

It’s a common mistake to think that MySQL doesn’t execute a query when you add EXPLAIN to it. In fact, if the query contains a subquery in the FROM clause, MySQL actually executes the subquery, places its results into a temporary table, and then finishes optimizing the outer query. It has to process all such subqueries before it can optimize the outer query fully, which it must do for EXPLAIN. This means EXPLAIN can actually cause a great deal of work for the server if the statement contains expensive subqueries or views that use the TEMPTABLE algorithm.

EXPLAIN is an approximation. Sometimes it’s a good approximation, but at other times, it can be very far from the truth. Here are some of its limitations:

  • EXPLAIN doesn’t tell you anything about how triggers, stored functions, or UDFs will affect your query.
  • It doesn’t work for stored procedures.
  • It doesn’t tell you about optimization MySQL does during query execution.
  • Some of the statistics it shows are estimates and can be very inaccurate.
  • It doesn’t show you everything there is to know about a query’s execution plan.
  • It doesn’t distinguish between some things with the same name. For example, it uses “filesort” for in-memory sorts and for temporary files, and it displays “Using temporary” for temporary tables on disk and in memory.

Rewriting Non-SELECT Queries

MySQL can EXPLAIN only SELECT queries, not stored routine calls or INSERT, UPDATE, DELETE, or any other statements. However, you can rewrite some non-SELECT queries to be EXPLAIN-able. To do this, you just need to convert the statement into an equivalent SELECT that accesses all the same columns. Any columns mentioned must be in a SELECT list, a join clause, or a WHERE clause. Note that since MySQL 5.6, EXPLAIN can also be applied to INSERT, UPDATE, and DELETE statements.

For example:

UPDATE actor 
INNER JOIN film_actor USING(actor_id)
SET actor.last_update=film_actor.last_update;

can be rewritten as:

EXPLAIN SELECT actor.last_update, film_actor.last_update 
FROM actor
INNER JOIN film_actor USING(actor_id)

The Columns in EXPLAIN

EXPLAIN’s output always has the same columns (EXPLAIN EXTENDED adds a filtered column in MySQL 5.1).

Keep in mind that the rows in the output come in the order in which MySQL actually executes the parts of the query, which is not always the same as the order in which they appear in the original SQL. In MySQL’s EXPLAIN output, the columns are: id, select_type, table, partitions, type, possible_keys, key, key_len, ref, rows, filtered, and Extra.

The id Column

This column always contains a number, which identifies the SELECT to which the row belongs.

MySQL divides SELECT queries into simple and complex types, and the complex types can be grouped into three broad classes: simple subqueries, derived tables (subqueries in the FROM clause), and UNIONs. Here is a simple subquery example:

EXPLAIN SELECT (SELECT 1 FROM sakila.actor LIMIT 1) FROM sakila.film;
+----+-------------+-------+...
| id | select_type | table |...
+----+-------------+-------+...
|  1 | PRIMARY     | film  |...
|  2 | SUBQUERY    | actor |...
+----+-------------+-------+...

Here is a derived tables query example:

EXPLAIN SELECT film_id FROM (SELECT film_id FROM sakila.film) AS der;
+----+-------------+------------+...
| id | select_type | table      |...
+----+-------------+------------+...
|  1 | PRIMARY     | <derived2> |...
|  2 | DERIVED     | film       |...
+----+-------------+------------+...

A UNION query example:

EXPLAIN SELECT 1 UNION ALL SELECT 1;
+------+--------------+------------+...
| id   | select_type  | table      |...
+------+--------------+------------+...
| 1    | PRIMARY      | NULL       |...
| 2    | UNION        | NULL       |...
| NULL | UNION RESULT | <union1,2> |...
+------+--------------+------------+...

UNION results are always placed into an anonymous temporary table, and MySQL then reads the results back out of the temporary table. The temporary table doesn’t appear in the original SQL, so its id column is NULL.

The select_type Column

This column shows whether the row is a simple or complex SELECT (and if it’s the latter, which of the three complex types it is). The value SIMPLE means the query contains no subqueries or UNIONs. If the query has any such complex subparts, the outermost part is labeled PRIMARY, and other parts are labeled as follows:

  • SUBQUERY. A SELECT that is contained in a subquery in the SELECT clause (not in the FROM clause) is labeled SUBQUERY.
  • DERIVED. A SELECT that is contained in a subquery in the FROM clause. The server refers to this as a “derived table” internally, because the temporary table is derived from the subquery.
  • UNION. The second and subsequent SELECTs in a UNION are labeled as UNION. The first SELECT is labeled PRIMARY as though it is executed as part of the outer query.
  • UNION RESULT. The SELECT used to retrieve results from the UNION’s anonymous temporary table is labeled as UNION RESULT.

In addition to these values, a SUBQUERY and a UNION can be labeled as DEPENDENT and UNCACHEABLE. DEPENDENT means the SELECT depends on data that is found in an outer query; UNCACHEABLE means something in the SELECT prevents the result from being cached with an Item_cache (such as the RAND() function).

UNCACHEABLE UNION example:

EXPLAIN SELECT tmp.id
FROM (SELECT * FROM test t1 WHERE t1.id=RAND()
      UNION ALL
      SELECT * FROM test t2 WHERE t2.id=RAND()) AS tmp;
+----+-------------------+------------+...
| id | select_type       | table      |...
+----+-------------------+------------+...
|  1 | PRIMARY           | <derived2> |...
|  2 | DERIVED           | t1         |...
|  3 | UNCACHEABLE UNION | t2         |...
+----+-------------------+------------+...

The table Column

This column shows which table the row is accessing. It’s the table, or its alias. Roughly speaking, the number of rows in the EXPLAIN output equals the number of tables touched by the query’s SELECTs, JOINs, and UNIONs.

MySQL’s query execution plans are always left-deep trees. The leaf nodes in order correspond directly to the rows in EXPLAIN. For example:

EXPLAIN SELECT film.film_id
FROM sakila.film
     INNER JOIN sakila.film_actor USING(film_id)
     INNER JOIN sakila.actor USING(actor_id);
+----+-------------+------------+...
| id | select_type | table      |...
+----+-------------+------------+...
|  1 | SIMPLE      | actor      |...
|  1 | SIMPLE      | film_actor |...
|  1 | SIMPLE      | film       |...
+----+-------------+------------+...

Derived tables and unions

The table column becomes much more complicated when there is a subquery in the FROM clause or a UNION. In these cases, there really isn’t a “table” to refer to, because the anonymous temporary table MySQL creates exists only while the query is executing.

When there’s a subquery in the FROM clause, the table column is of the form <derivedN>, where N is the subquery’s id. This is always a “forward reference”. In other words, N refers to a later row in the EXPLAIN output.

When there’s a UNION, the UNION RESULT table column contains a list of ids that participate in the UNION. This is always a “backward reference”, because the UNION RESULT comes after all of the rows that participate in the UNION.

An example with complex SELECT types:

EXPLAIN
SELECT actor_id,
       (SELECT 1 FROM sakila.film_actor WHERE film_actor.actor_id =
        der_1.actor_id LIMIT 1)
FROM (
       SELECT actor_id
       FROM sakila.actor LIMIT 5
     ) AS der_1
UNION ALL
SELECT film_id,
       (SELECT @var1 FROM sakila.rental LIMIT 1)
FROM (
       SELECT film_id,
              (SELECT 1 FROM sakila.store LIMIT 1)
       FROM sakila.film LIMIT 5
     ) AS der_2;
+------+----------------------+------------+...
| id   | select_type          | table      |...
+------+----------------------+------------+...
| 1    | PRIMARY              | <derived3> |...
| 3    | DERIVED              | actor      |...
| 2    | DEPENDENT SUBQUERY   | film_actor |...
| 4    | UNION                | <derived6> |...
| 6    | DERIVED              | film       |...
| 7    | SUBQUERY             | store      |...
| 5    | UNCACHEABLE SUBQUERY | rental     |...
| NULL | UNION RESULT         | <union1,4> |...
+------+----------------------+------------+...

Reading EXPLAIN’s output often requires you to jump forward and backward in the list.

The type Column

The MySQL manual says this column shows the “join type”, but it’s more accurate to say the access type. In other words, how MySQL has decided to find rows in the table. Here are the most important access methods, from worst to best:

  • ALL. This means a table scan. MySQL must scan through the table from beginning to end to find the row. (There are exceptions, such as queries with LIMIT or queries that display “Using distinct/not exists” in the Extra column.)
  • index. This means an index scan. The main advantage is that this avoids sorting. The biggest disadvantage is the cost of reading an entire table in index order. This usually means accessing the rows in random order, which is very expensive. If you also see “Using index” in the Extra column, it means MySQL is using a covering index. This is much less expensive than scanning the table in index order.
  • range. A range scan is a limited index scan. It begins at some point in the index and returns rows that match a range of values. This is better than a full index scan because it doesn’t go through the entire index. Obvious range scans are queries with a BETWEEN or > in the WHERE clause.
  • ref. This is an index access (or index lookup) that returns rows that match a single value. The ref_or_null access type is a variation on ref. It means MySQL must do a second lookup to find NULL entries after doing the initial lookup.
  • eq_ref. This is an index lookup that MySQL knows will return at most a single value. You will see this access method when MySQL decides to use a primary key or unique index to satisfy the query by comparing it to some reference value.
  • const, system. MySQL uses these access types when it can optimize away some part of the query and turn it into a constant. The table has at most one matching row, which is read at the start of the query. For example, this happens if you select a row by placing its primary key into the WHERE clause, e.g. SELECT * FROM <table_name> WHERE <primary_key_column>=1;
  • NULL. This access method means MySQL can resolve the query during the optimization phase and will not even access the table or index during the execution stage. For example, selecting the minimum value from an indexed column can be done by looking at the index alone and requires no table access during execution.

Other types

  • fulltext. The join is performed using a FULLTEXT index.
  • index_merge. This join type indicates that the Index Merge optimization is used. In this case, the key column in the output row contains a list of indexes used. It indicates that the query makes limited use of multiple indexes from a single table. For example, the film_actor table has an index on film_id and an index on actor_id, and the query is SELECT film_id, actor_id FROM sakila.film_actor WHERE actor_id = 1 OR film_id = 1
  • unique_subquery. This type replaces eq_ref for some IN subqueries of the following form: value IN (SELECT primary_key FROM single_table WHERE some_expr)
  • index_subquery. This join type is similar to unique_subquery. It replaces IN subqueries, but it works for nonunique indexes in subqueries: value IN (SELECT key_column FROM single_table WHERE some_expr)

The possible_keys Column

This column shows which indexes could be used for the query, based on the columns the query accesses and the comparison operators used. This list is created early in the optimization phase, so some of the indexes listed might be useless for the query after subsequent optimization phases.

The key Column

This column shows which index MySQL decided to use to optimize the access to the table. If the index doesn’t appear in possible_keys, MySQL chose it for another reason–for example, it might choose a covering index even when there is no WHERE clause.

In other words, possible_keys reveals which indexes can help make row lookups efficient, but key shows which index the optimizer decided to use to minimize query cost.

Here’s an example:

EXPLAIN SELECT actor_id, film_id FROM sakila.film_actor;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: film_actor
         type: index
possible_keys: NULL
          key: idx_fk_film_id
      key_len: 2
          ref: NULL
         rows: 5143
        Extra: Using index

The key_len Column

This column shows the number of bytes MySQL will use in the index. If MySQL is using only some of the index’s columns, you can use this value to calculate which columns it uses. Remember that MySQL 5.5 and older versions can use only the leftmost prefix of the index. For example, film_actor’s primary key covers two SMALLINT columns, and a SMALLINT is two bytes, so each tuple in the index is four bytes. Here’s a query example:

EXPLAIN SELECT actor_id, film_id FROM sakila.film_actor WHERE actor_id=4;
...+------+---------------+---------+---------+...
...| type | possible_keys | key     | key_len |...
...+------+---------------+---------+---------+...
...| ref  | PRIMARY       | PRIMARY | 2       |...
...+------+---------------+---------+---------+...

Knowing how many bytes one index tuple occupies, you can compare that with the key_len value reported by EXPLAIN to work out how many of the index’s columns the query actually uses.

MySQL doesn’t always show you how much of an index is really being used. For example, if you perform a LIKE query with a prefix pattern match, it will show that the full width of the column is being used.

The key_len column shows the maximum possible length of the indexed fields, not the actual number of bytes the data in the table used.

The ref Column

This column shows which columns or constant from preceding tables are being used to look up values in the index named in the key column.

The rows Column

This column shows the number of rows MySQL estimates it will need to read to find the desired rows.

Remember, this is the number of rows MySQL thinks it will examine, not the number of rows in the result set. Also realize that there are many optimizations, such as join buffers and caches, that aren’t factored into the number of rows shown.

The filtered Column

This column shows a pessimistic estimate of the percentage of rows that will satisfy some condition on the table, such as a WHERE clause or a join condition. If you multiply the rows column by this percentage, you will see the number of rows MySQL estimates it will join with the previous tables in the query plan.

The Extra Column

This column contains extra information that doesn’t fit into other columns. The most important values you might see frequently are as follows:

  • Using index. This indicates that MySQL will use a covering index to avoid accessing the table.
  • Using where. This means the MySQL server will post-filter rows after the storage engine retrieves them.
  • Using temporary. This means MySQL will use a temporary table while sorting the query’s result.
  • Using filesort. This means MySQL will use an external sort to order the results, instead of reading the rows from the table in index order. MySQL has two filesort algorithms. Either type can be done in memory or on disk. EXPLAIN doesn’t tell you which type of filesort MySQL will use, and it doesn’t tell you whether the sort will be done in memory or on disk.
  • Range checked for each record (index map:N). This value means there’s no good index, and the indexes will be reevaluated for each row in a join. N is a bitmap of the indexes shown in possible_keys and is redundant.
  • Using index condition: Tables are read by accessing index tuples and testing them first to determine whether to read full table rows.

Improvements in MySQL 5.6

Some EXPLAIN improvements in MySQL 5.6:

  • The ability to explain queries such as UPDATE, INSERT, and so on.
  • A variety of improvements to the query optimizer and execution engine that allow anonymous temporary tables to be materialized as late as possible, rather than always creating and filling them before optimizing and executing the portions of the query that refer to them. This will allow MySQL to explain queries with subqueries instantly, without having to actually execute the subqueries first.

Conclusion

The most important columns of EXPLAIN are type and Extra. They show whether the query uses an index or a covering index.

The most important access methods (the type column of EXPLAIN), from worst to best:

  • ALL
  • index
  • range
  • ref
  • eq_ref
  • const, system
  • NULL

References

[1] High Performance MySQL by Baron Schwartz, Vadim Tkachenko, Peter Zaitsev, Derek J. Balling

[2] EXPLAIN Output Format - MySQL 8.0 Documentation
