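ZooKeeper Tutorial
The following is a complete beginner's tutorial for Apache ZooKeeper, going from zero background to real deployment and use. It targets the ZooKeeper 3.8.x series (still a mainstream stable release line as of October 2025).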
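I. What is ZooKeeper?
Apache ZooKeeper is an open-source coordination service for distributed systems. It is commonly used for:
- Configuration management
- Distributed locks
- Naming services
- Leader election
- State synchronization
Core idea: ZooKeeper exposes a small, highly available, file-system-like tree of nodes with strong consistency and ordering guarantees. All updates are applied in a single global order.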
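II. Core concepts
| Concept | Description |
|---|---|
| ZNode | A node in the tree (similar to a file or directory in a file system) |
| Data node | A ZNode that stores data (any ZNode can hold a small payload, just under 1 MB by default) |
| Ephemeral node | Deleted automatically when the session that created it ends |
| Sequential node | The server appends a 10-digit, monotonically increasing sequence number, e.g. /lock/task-0000000001 |
| Watcher | Event notification mechanism (a standard watch fires once) |
| ACL | Access control list |
| Session | The connection context between a client and the ZooKeeper ensemble |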
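To make the sequential-node naming concrete, here is a minimal sketch of how the suffix is formed and read back. It relies on the documented 10-digit zero-padded format; the class SequentialNames and its method names are hypothetical helpers for illustration, not part of the ZooKeeper API.

```java
import java.util.Locale;

final class SequentialNames {
    // Appends the counter as a 10-digit, zero-padded decimal, the same
    // layout the server uses for sequential znodes.
    static String withSequence(String pathPrefix, int counter) {
        return pathPrefix + String.format(Locale.ROOT, "%010d", counter);
    }

    // Reads the numeric suffix back from a node name.
    static int sequenceOf(String nodeName) {
        return Integer.parseInt(nodeName.substring(nodeName.length() - 10));
    }

    public static void main(String[] args) {
        System.out.println(withSequence("/lock/task-", 1));
        System.out.println(sequenceOf("task-0000000012"));
    }
}
```

Because the suffix has a fixed width, sorting sibling names as plain strings gives the same order as sorting by sequence number, as long as the siblings share one prefix.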
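III. Installation and startup (standalone mode)
1. Download
wget https://downloads.apache.org/zookeeper/zookeeper-3.8.4/apache-zookeeper-3.8.4-bin.tar.gz
tar -zxvf apache-zookeeper-3.8.4-bin.tar.gz
cd apache-zookeeper-3.8.4-bin
If the 3.8.4 archive is no longer served from downloads.apache.org, older releases are kept at https://archive.apache.org/dist/zookeeper/.
2. Configure (standalone)
cp conf/zoo_sample.cfg conf/zoo.cfg
Edit conf/zoo.cfg:
tickTime=2000
dataDir=/tmp/zookeeper/data
dataLogDir=/tmp/zookeeper/logs
clientPort=2181
Create the directories (paths under /tmp suit a local test only, since they may be cleared on reboot):
mkdir -p /tmp/zookeeper/data
mkdir -p /tmp/zookeeper/logs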
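The tickTime value above also bounds client session timeouts. By default the server clamps a requested timeout to between 2 and 20 ticks (the minSessionTimeout and maxSessionTimeout defaults). The sketch below only reproduces that arithmetic; the class SessionTimeouts is a hypothetical helper, and it assumes neither limit is overridden in zoo.cfg.

```java
final class SessionTimeouts {
    // Clamps the client's requested timeout to the default server bounds:
    // at least 2 ticks and at most 20 ticks.
    static int negotiated(int requestedMs, int tickTimeMs) {
        int min = 2 * tickTimeMs;
        int max = 20 * tickTimeMs;
        return Math.max(min, Math.min(max, requestedMs));
    }

    public static void main(String[] args) {
        // With tickTime=2000 the allowed range is 4000..40000 ms.
        System.out.println(negotiated(5000, 2000));
        System.out.println(negotiated(1000, 2000));
        System.out.println(negotiated(60000, 2000));
    }
}
```

With tickTime=2000, a client asking for 5000 ms keeps 5000 ms, while requests of 1000 ms and 60000 ms are adjusted to 4000 ms and 40000 ms.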
3. Start
bin/zkServer.sh start
Check the status:
bin/zkServer.sh status
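IV. Command-line client (zkCli)
bin/zkCli.sh -server 127.0.0.1:2181
Common commands
| Command | Purpose |
|---|---|
| ls / | List the children of the root node |
| create /myapp "hello" | Create a persistent node |
| create -e /temp "tmp" | Create an ephemeral node |
| create -s /tasks/task "" | Create a sequential node (the parent /tasks must already exist) |
| get /myapp | Read the node's data |
| set /myapp "world" | Update the node's data |
| delete /myapp | Delete a node that has no children |
| deleteall /myapp | Delete a node and all of its descendants (replaces the deprecated rmr) |
| getAcl /myapp, setAcl | Read and set ACLs |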
V. Java client programming (native API)
1. Add the dependency (Maven)
<dependency>
<groupId>org.apache.zookeeper</groupId>
<artifactId>zookeeper</artifactId>
<version>3.8.4</version>
</dependency>
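2. Basic connection and operations
import org.apache.zookeeper.*;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;

public class ZkDemo {
    private static final String CONNECT_STRING = "localhost:2181";
    private static final int SESSION_TIMEOUT = 5000; // milliseconds
    private static final CountDownLatch connectedSignal = new CountDownLatch(1);

    public static void main(String[] args) throws Exception {
        ZooKeeper zk = new ZooKeeper(CONNECT_STRING, SESSION_TIMEOUT, watchedEvent -> {
            if (watchedEvent.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connectedSignal.countDown();
            }
        });
        connectedSignal.await(); // block until the session is established
        try {
            // Create a persistent node; tolerate reruns where it already exists
            try {
                String path = zk.create("/myapp", "hello zk".getBytes(StandardCharsets.UTF_8),
                        ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);
                System.out.println("Created node: " + path);
            } catch (KeeperException.NodeExistsException e) {
                System.out.println("Node already exists: /myapp");
            }
            // Read the data and register a one-time watch
            byte[] data = zk.getData("/myapp", watchedEvent ->
                    System.out.println("Data of /myapp changed!"), null);
            System.out.println("Data: " + new String(data, StandardCharsets.UTF_8));
            // Update the data; version -1 skips the version check
            zk.setData("/myapp", "new value".getBytes(StandardCharsets.UTF_8), -1);
            Thread.sleep(2000); // give the watch callback time to run
        } finally {
            zk.close();
        }
    }
}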
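The watch registered through getData in the demo fires once and is then discarded. The toy model below illustrates only that one-shot behavior in memory. It is not ZooKeeper code: the class OneShotWatchModel and its methods are hypothetical, and real watches are delivered asynchronously by the client library.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class OneShotWatchModel {
    private final Map<String, String> data = new HashMap<>();
    private final Map<String, List<Runnable>> watches = new HashMap<>();

    // Returns the current value and, if a watch is given, registers it for
    // exactly one future change of this path.
    String getData(String path, Runnable watch) {
        if (watch != null) {
            watches.computeIfAbsent(path, p -> new ArrayList<>()).add(watch);
        }
        return data.get(path);
    }

    // Stores the value, then fires and discards every watch on the path.
    void setData(String path, String value) {
        data.put(path, value);
        List<Runnable> fired = watches.remove(path);
        if (fired != null) {
            fired.forEach(Runnable::run);
        }
    }

    public static void main(String[] args) {
        OneShotWatchModel zk = new OneShotWatchModel();
        zk.setData("/myapp", "hello zk");
        zk.getData("/myapp", () -> System.out.println("changed"));
        zk.setData("/myapp", "v1"); // the watch fires here
        zk.setData("/myapp", "v2"); // nothing fires: the watch is already gone
    }
}
```

To keep receiving notifications, a client calls getData with a watcher again inside the callback. Changes that happen between the notification and the re-registration are not reported individually, so the callback should always re-read the current value.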
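VI. Common use cases in practice
1. Distributed lock (simplified, blocking, not reentrant)
import org.apache.zookeeper.*;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class DistributedLock {
    private final ZooKeeper zk;
    private final String lockPath = "/locks/my_lock"; // parent must already exist
    private String lockNode;

    public DistributedLock(ZooKeeper zk) {
        this.zk = zk;
    }

    // Blocks until this client owns the lock.
    public void lock() throws Exception {
        // Create our queue entry once; the retry loop must not create more nodes
        lockNode = zk.create(lockPath + "/lock-", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        String ownName = lockNode.substring(lockPath.length() + 1);
        while (true) {
            List<String> children = zk.getChildren(lockPath, false);
            Collections.sort(children);
            int idx = children.indexOf(ownName);
            if (idx < 0) {
                throw new IllegalStateException("Own lock node is gone: " + lockNode);
            }
            if (idx == 0) {
                return; // lowest sequence number: lock acquired
            }
            // Watch only the node directly ahead of ours to avoid a herd effect
            String prev = lockPath + "/" + children.get(idx - 1);
            CountDownLatch prevChanged = new CountDownLatch(1);
            if (zk.exists(prev, event -> prevChanged.countDown()) != null) {
                prevChanged.await(); // any event on prev: loop and re-check
            }
        }
    }

    public void unlock() throws Exception {
        zk.delete(lockNode, -1);
    }
}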
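The lock above depends on finding the node that sorts immediately before the client's own node. A minimal sketch of that step in plain Java follows. The class LockOrder and its methods are hypothetical helpers, and the code assumes every child name ends in a dash followed by the numeric sequence suffix.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

final class LockOrder {
    // Parses the digits after the last '-' in a child name.
    static int sequenceOf(String name) {
        return Integer.parseInt(name.substring(name.lastIndexOf('-') + 1));
    }

    // Returns the child that sorts immediately before ownName, or empty
    // when ownName has the lowest sequence number (it holds the lock).
    static Optional<String> predecessor(List<String> children, String ownName) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingInt(LockOrder::sequenceOf));
        int idx = sorted.indexOf(ownName);
        if (idx < 0) {
            throw new IllegalArgumentException("own node missing: " + ownName);
        }
        return idx == 0 ? Optional.empty() : Optional.of(sorted.get(idx - 1));
    }
}
```

Each waiter watches only its predecessor, so releasing the lock wakes exactly one client instead of every client queued under the lock path.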
2. Leader election
import org.apache.zookeeper.*;
import java.util.Collections;
import java.util.List;

public class LeaderElection implements Watcher {
    private final ZooKeeper zk;
    private final String electionPath = "/election"; // parent must already exist
    private String currentNode;

    public LeaderElection(ZooKeeper zk) {
        this.zk = zk;
    }

    public void elect() throws Exception {
        currentNode = zk.create(electionPath + "/n_", new byte[0],
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        checkMaster();
    }

    private void checkMaster() throws Exception {
        String ownName = currentNode.substring(electionPath.length() + 1);
        while (true) {
            List<String> children = zk.getChildren(electionPath, false);
            Collections.sort(children);
            int idx = children.indexOf(ownName);
            if (idx < 0) {
                throw new IllegalStateException("Own election node is gone: " + currentNode);
            }
            if (idx == 0) {
                System.out.println("I am the leader!");
                becomeLeader();
                return;
            }
            // Watch only the candidate directly ahead of us
            String prev = electionPath + "/" + children.get(idx - 1);
            if (zk.exists(prev, this) != null) {
                return; // watch registered; process() runs when prev changes
            }
            // prev vanished between getChildren and exists: re-check at once
        }
    }

    private void becomeLeader() {
        // Placeholder: start leader-only work here
    }

    @Override
    public void process(WatchedEvent event) {
        if (event.getType() == Event.EventType.NodeDeleted) {
            try {
                checkMaster();
            } catch (Exception e) {
                e.printStackTrace(); // do not swallow failures silently
            }
        }
    }
}
VII. Production cluster deployment (3 nodes)
1. Configure zoo.cfg on every machine
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
server.1=zk1:2888:3888
server.2=zk2:2888:3888
server.3=zk3:2888:3888
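Both initLimit and syncLimit are counted in ticks, so their real durations depend on tickTime. The sketch below only does that multiplication for the values above; the class QuorumTimings is a hypothetical helper.

```java
final class QuorumTimings {
    // Time a follower gets to connect to the leader and finish its
    // initial sync, in milliseconds.
    static long initWindowMs(int tickTimeMs, int initLimit) {
        return (long) tickTimeMs * initLimit;
    }

    // Maximum time a follower may lag behind the leader before it is
    // dropped, in milliseconds.
    static long syncWindowMs(int tickTimeMs, int syncLimit) {
        return (long) tickTimeMs * syncLimit;
    }

    public static void main(String[] args) {
        System.out.println(initWindowMs(2000, 10)); // initLimit=10
        System.out.println(syncWindowMs(2000, 5));  // syncLimit=5
    }
}
```

With this configuration a follower has 20 seconds for its initial sync and may fall at most 10 seconds behind the leader. Ensembles with large data sets or slow links may need a larger initLimit.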
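2. Create the myid file
The number in each file must match that machine's server.N entry in zoo.cfg.
# zk1
echo "1" > /var/lib/zookeeper/myid
# zk2
echo "2" > /var/lib/zookeeper/myid
# zk3
echo "3" > /var/lib/zookeeper/myid
3. Start the cluster
Run on each of the three machines:
bin/zkServer.sh start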
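An ensemble stays available only while a strict majority of its servers is up, which is why three nodes is the usual minimum for production. The arithmetic can be sketched as follows; the class QuorumMath is a hypothetical helper, and it assumes the default majority quorum.

```java
final class QuorumMath {
    // Smallest number of servers that forms a strict majority.
    static int majority(int ensembleSize) {
        return ensembleSize / 2 + 1;
    }

    // Number of servers that can fail while a majority remains.
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - majority(ensembleSize);
    }

    public static void main(String[] args) {
        for (int n = 1; n <= 5; n++) {
            System.out.println(n + " servers: quorum " + majority(n)
                    + ", tolerates " + tolerableFailures(n) + " failure(s)");
        }
    }
}
```

A 3-node ensemble tolerates one failure, and a fourth node does not raise that number, so ensembles are normally sized with an odd number of servers.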
VIII. Curator framework (recommended: a simpler API)
1. Add the dependency
<dependency>
<groupId>org.apache.curator</groupId>
<artifactId>curator-recipes</artifactId>
<version>5.5.0</version>
</dependency>
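2. Distributed lock example
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.framework.recipes.locks.InterProcessMutex;
import org.apache.curator.retry.RetryNTimes;
import java.util.concurrent.TimeUnit;
// Retry up to 3 times, sleeping 1000 ms between attempts
CuratorFramework client = CuratorFrameworkFactory.newClient("zk1:2181", new RetryNTimes(3, 1000));
client.start();
InterProcessMutex lock = new InterProcessMutex(client, "/locks/my_lock");
if (lock.acquire(10, TimeUnit.SECONDS)) { // wait at most 10 seconds
    try {
        System.out.println("Lock acquired");
    } finally {
        lock.release();
    }
} else {
    System.out.println("Could not acquire the lock within 10 seconds");
}
client.close();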
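The RetryNTimes(3, 1000) policy in the example means: retry a failed operation up to 3 times, sleeping 1000 ms between attempts. The sketch below shows the general shape of such a policy in plain Java. It is an illustration only, not Curator's implementation, and the class RetrySketch and the method callWithRetries are hypothetical.

```java
import java.util.concurrent.Callable;

final class RetrySketch {
    // Runs the task once, then retries up to maxRetries more times,
    // sleeping sleepMs between attempts. Rethrows the last failure.
    static <T> T callWithRetries(Callable<T> task, int maxRetries, long sleepMs)
            throws Exception {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        Exception last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                last = e;
                if (attempt < maxRetries) {
                    Thread.sleep(sleepMs);
                }
            }
        }
        throw last;
    }
}
```

A fixed sleep is the simplest choice. Curator also ships ExponentialBackoffRetry, which spaces retries further apart and is often preferred when many clients may reconnect at once.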
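IX. Monitoring and operations
| Tool | Purpose |
|---|---|
| zkCli.sh | Command-line administration |
| Four-letter commands | Quick health checks, e.g. echo stat \| nc localhost 2181 |
| JMX | JVM-level metrics; zkServer.sh enables local JMX by default, and remote access needs the standard com.sun.management.jmxremote JVM options |
| ZooKeeper Exporter | Prometheus monitoring |
Common four-letter commands. Since ZooKeeper 3.5 only srvr is enabled by default, so first add 4lw.commands.whitelist=srvr,mntr,ruok,stat to zoo.cfg and restart the server:
echo mntr | nc localhost 2181 # monitoring metrics
echo ruok | nc localhost 2181 # liveness check (replies imok)
echo stat | nc localhost 2181 # server status and connected clients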
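The mntr command replies with one metric per line, the key and value separated by a tab. A small parser for that reply could look like the sketch below; the class MntrParser is a hypothetical helper, and the sample keys in main are examples of what a server may report.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MntrParser {
    // Splits the reply into lines and each line at its first tab.
    // Lines without a tab are skipped.
    static Map<String, String> parse(String output) {
        Map<String, String> metrics = new LinkedHashMap<>();
        for (String line : output.split("\\R")) {
            int tab = line.indexOf('\t');
            if (tab > 0) {
                metrics.put(line.substring(0, tab), line.substring(tab + 1).trim());
            }
        }
        return metrics;
    }

    public static void main(String[] args) {
        String sample = "zk_server_state\tfollower\nzk_znode_count\t5\n";
        System.out.println(parse(sample));
    }
}
```

A monitoring script could feed the output of echo mntr | nc localhost 2181 into parse and alert on values such as zk_server_state.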
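X. Common problems
| Problem | Solution |
|---|---|
| ConnectionLoss | Check the network and the session timeout |
| Node already exists (NodeExistsException) | Call exists() first or catch KeeperException.NodeExistsException; use CreateMode.PERSISTENT_SEQUENTIAL only when a unique generated name is what you want |
| Watcher does not fire | Watches are one-time triggers and must be registered again after each event |
| Cluster fails to start | Check the myid files, the ports (2181, 2888, 3888), and firewall rules |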
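Learning resources
- Official documentation: https://zookeeper.apache.org/doc/current/
- Curator: https://curator.apache.org
- GitHub examples: https://github.com/apache/zookeeper/tree/master/zookeeper-recipes
Suggested hands-on exercises:
- Set up a 3-node cluster
- Implement a distributed lock in Java
- Implement leader election with Curator
- Write a configuration center (watching the /config node)
Would you like a complete runnable demo project (GitHub) or a Docker Compose cluster configuration? Feel free to ask!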