Improving the Data Capture Program

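Article 8 of this series introduced join with a data capture use case: there are several capture servers plus one host, and the host has to wait until every server has finished capturing before it does any further processing. Each capture server is handled by one thread, as in the code written earlier: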

public class ThreadJoin3 {
    public static void main(String[] args) throws InterruptedException {
        long startTimestamp = System.currentTimeMillis();

        // Assume there are three machines; start one thread for each.
        Thread m1 = new Thread(new CaptureRunnable("M1", 1_000L));
        Thread m2 = new Thread(new CaptureRunnable("M2", 2_000L));
        Thread m3 = new Thread(new CaptureRunnable("M3", 3_000L));

        m1.start();
        m2.start();
        m3.start();
        m1.join();
        m2.join();
        m3.join();

        long endTimestamp = System.currentTimeMillis();

        System.out.printf("Save data begin timestamp is %s, end timestamp is %s\n", startTimestamp, endTimestamp);
        System.out.printf("Spend time is %s", endTimestamp - startTimestamp);
    }
}

/**
 * Task for one capture server node.
 */
class CaptureRunnable implements Runnable {
    // Name of the machine node
    private String machineName;
    // Time the capture takes, in milliseconds
    private long spendTime;

    public CaptureRunnable(String machineName, long spendTime) {
        this.machineName = machineName;
        this.spendTime = spendTime;
    }

    @Override
    public void run() {
        // Simulate the actual data capture.
        try {
            Thread.sleep(spendTime);
            // Pass machineName as an argument so a '%' in the name cannot break the format string.
            System.out.printf("%s completed data capture successfully at timestamp [%s].\n", machineName, System.currentTimeMillis());
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
    }

    public String getResult() {
        return machineName + " finish.";
    }
}

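join is used so that the main thread continues only after all three threads have ended, which gives one common end time. That covers three threads for three servers, but what about thousands of servers? One thread per server is no longer realistic: every thread needs its own stack (see stackSize), so the number of threads a JVM can create is limited by the available memory. Thread synchronization can be used instead to cap how many threads do work at the same time. Let's build this up step by step.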

First create ten threads, here by way of a Stream:

import java.util.Optional;
import java.util.stream.Stream;

public class CaptureService {
    public static void main(String[] args) {
        Stream.of("M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8", "M9", "M10")
                .map(CaptureService::createCaptureService)
                .forEach(t -> {
                    t.start();
                    try {
                        t.join();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                });
        Optional.of("All of capture work finished.").ifPresent(System.out::println);
    }

    public static Thread createCaptureService(String name) {
        return new Thread(() -> {
            //TODO
            System.out.println(Thread.currentThread().getName());
        }, name);

    }
}

Running it prints the thread names M1 through M10 in order, followed by "All of capture work finished."

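All the threads do run, but there is a real problem: because join() is called inside forEach, the main thread waits for the current thread to finish before it starts the next one, rather than waiting for all threads together. The threads therefore run one after another instead of concurrently.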
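One way around this is to split starting and joining into two passes: start every thread first, and only then join them. The following is a minimal sketch of that idea; the class name StartThenJoinSketch, the method runAll, and the counter are hypothetical names introduced for illustration and are not part of the original code.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StartThenJoinSketch {

    // Starts one thread per name, waits for all of them, and returns how many finished.
    static int runAll(String... names) {
        AtomicInteger finished = new AtomicInteger();

        // Pass 1: create every thread, then start them all so they run concurrently.
        List<Thread> workers = Stream.of(names)
                .map(name -> new Thread(() -> {
                    System.out.println(Thread.currentThread().getName());
                    finished.incrementAndGet();
                }, name))
                .collect(Collectors.toList());
        workers.forEach(Thread::start);

        // Pass 2: only now wait for each thread to end.
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt status and stop waiting
                break;
            }
        }
        return finished.get();
    }

    public static void main(String[] args) {
        int finished = runAll("M1", "M2", "M3", "M4", "M5");
        System.out.println(finished + " workers finished.");
    }
}
```

Because all threads are already running before the first join() call, the order in which the names are printed can vary from run to run.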

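The code uses Java 8's Stream.forEach() here. It is a terminal operation, meaning it is the call that actually triggers the stream pipeline. Streams will be covered in more depth later; for now it is enough to know that this exists.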

Next, flesh out createCaptureService:

1. Add a MAX_WORKER constant to CaptureService for the maximum number of threads that may work at the same time.

private static final int MAX_WORKER = 5;

2. Define a class Control that acts as a token for entering the guarded work, and keep the tokens in a linked list.

private static final LinkedList<Control> CONTROLS = new LinkedList<>();

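3. If the list already holds MAX_WORKER tokens, call wait(). The check must use >= rather than >, otherwise one worker more than MAX_WORKER gets through. The call has to be made while holding the lock on CONTROLS, that is, inside a synchronized (CONTROLS) block, as in the complete code further down.

while (CONTROLS.size() >= MAX_WORKER) {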
    try {
        CONTROLS.wait();
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
}

4. Once the list holds fewer than MAX_WORKER tokens, there is room, so add a Control object to the list.

CONTROLS.addLast(new Control());

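5. Then run the capture logic. When it finishes, remove the first element from the list (first in, first out) and call notifyAll() to wake the waiting threads. One of them acquires the lock, passes the check in step 3, and carries on with step 4, and so on: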

synchronized (CONTROLS) {
    Optional.of("The worker [" + Thread.currentThread().getName() + "] END capture data.").ifPresent(System.out::println);
    CONTROLS.removeFirst();
    CONTROLS.notifyAll();
}
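The five steps can also be packaged into a small gate object, which makes it easy to check that the limit really holds. The following is only a sketch under assumptions: the class WorkerGateSketch, its methods enter, exit, and peak, and the helper runAndMeasurePeak are hypothetical names, and a plain counter stands in for the list of Control tokens.

```java
import java.util.ArrayList;
import java.util.List;

public class WorkerGateSketch {

    private final Object lock = new Object();
    private final int maxWorkers;
    private int running; // number of workers currently inside the gate, guarded by lock
    private int peak;    // highest value 'running' ever reached, guarded by lock

    WorkerGateSketch(int maxWorkers) {
        this.maxWorkers = maxWorkers;
    }

    // Blocks until fewer than maxWorkers workers are running, then takes a slot.
    void enter() throws InterruptedException {
        synchronized (lock) {
            // 'while' and not 'if': the condition is re-checked after every wake-up.
            while (running >= maxWorkers) {
                lock.wait();
            }
            running++;
            peak = Math.max(peak, running);
        }
    }

    // Gives the slot back and wakes the waiting workers.
    void exit() {
        synchronized (lock) {
            running--;
            lock.notifyAll();
        }
    }

    int peak() {
        synchronized (lock) {
            return peak;
        }
    }

    // Runs workerCount threads through one gate and returns the peak concurrency seen.
    static int runAndMeasurePeak(int workerCount, int maxWorkers) {
        WorkerGateSketch gate = new WorkerGateSketch(maxWorkers);
        List<Thread> workers = new ArrayList<>();
        for (int i = 1; i <= workerCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    gate.enter();
                    try {
                        Thread.sleep(50); // stands in for the capture work
                    } finally {
                        gate.exit();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "M" + i);
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return gate.peak();
    }

    public static void main(String[] args) {
        System.out.println("Peak concurrent workers: " + runAndMeasurePeak(10, 5));
    }
}
```

The reported peak should never exceed the maxWorkers argument, since a worker increments the counter only after the while loop has seen a free slot under the same lock.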

The complete code:

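import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

/**
 * @program: ThreadDemo
 * @description: Data capture: use multiple threads to collect the running status of multiple servers.
 * With only a few servers, one thread per server is fine;
 * with a very large number of servers, that approach is no longer feasible.
 * Instead, let a fixed number of threads work at a time and let the next ones proceed as they finish,
 * so that the number of working threads stays stable.
 * @author: hs96.cn@Gmail.com
 * @create: 2020-09-08
 */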
public class CaptureService {

    private static final int MAX_WORKER = 5;
    private static final LinkedList<Control> CONTROLS = new LinkedList<>();

    public static void main(String[] args) {
        List<Thread> worker = new ArrayList<>();
        Stream.of("M1", "M2", "M3", "M4", "M5", "M6", "M7", "M8", "M9", "M10")
                .map(CaptureService::createCaptureService)
                .forEach(t -> {
                    t.start();
                    worker.add(t);
                });

        worker.forEach(t -> {
            try {
                t.join();
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        });

        Optional.of("All of capture work finished.").ifPresent(System.out::println);
    }

    public static Thread createCaptureService(String name) {
        return new Thread(() -> {
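            // Optional.of(...).ifPresent(...) is only a printing idiom here: the string is never null,
            // and Optional.of would itself throw a NullPointerException if it were.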
            Optional.of("The worker [" + Thread.currentThread().getName() + "] BEGIN capture data.").ifPresent(System.out::println);
            synchronized (CONTROLS) {
                // Wait while MAX_WORKER tokens are already taken; with '>' a sixth worker would get through.
                while (CONTROLS.size() >= MAX_WORKER) {
                    try {
                        CONTROLS.wait();
                    } catch (InterruptedException e) {
                        e.printStackTrace();
                    }
                }

                CONTROLS.addLast(new Control());
            }

            Optional.of("The worker [" + Thread.currentThread().getName() + "] is WORKING...").ifPresent(System.out::println);
            try {
                Thread.sleep(10_000);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }

            synchronized (CONTROLS) {
                Optional.of("The worker [" + Thread.currentThread().getName() + "] END capture data.").ifPresent(System.out::println);
                CONTROLS.removeFirst();
                CONTROLS.notifyAll();
            }
        }, name);

    }

    private static class Control{

    }
}

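When it runs, all ten workers print their BEGIN message almost immediately, but only a bounded number of them reach WORKING at a time. Each time a worker prints END, a waiting worker proceeds, until "All of capture work finished." is printed at the end.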
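For comparison, the same cap can be expressed with java.util.concurrent.Semaphore from the JDK instead of a hand-written wait()/notifyAll() loop. This swaps in a different technique from the one used in this article and is only a sketch: the class name SemaphoreCaptureSketch, the method runAll, and the short sleep are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreCaptureSketch {

    // Runs workerCount workers, at most maxWorkers at a time, and returns how many completed.
    static int runAll(int workerCount, int maxWorkers) {
        Semaphore permits = new Semaphore(maxWorkers); // one permit per allowed worker
        AtomicInteger completed = new AtomicInteger();
        List<Thread> workers = new ArrayList<>();

        for (int i = 1; i <= workerCount; i++) {
            Thread t = new Thread(() -> {
                try {
                    permits.acquire(); // blocks while all permits are taken
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return; // no permit was taken, so nothing to release
                }
                try {
                    System.out.println("The worker [" + Thread.currentThread().getName() + "] is WORKING...");
                    Thread.sleep(50); // stands in for the capture work
                    completed.incrementAndGet();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    permits.release(); // always hand the permit back
                }
            }, "M" + i);
            workers.add(t);
            t.start();
        }

        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return completed.get();
    }

    public static void main(String[] args) {
        int done = runAll(10, 5);
        System.out.println(done + " workers completed.");
    }
}
```

The semaphore plays the role of the CONTROLS list: acquire() corresponds to waiting for room and adding a token, and release() corresponds to removing a token and notifying the waiters.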

posted @ 2020-09-08 01:40  风暴松鼠