Java versus Go: a performance comparison

Summary: Comparing the performance of Go with Java when filtering the lines of a large log file using regular expressions.

Together with a friend, I recently pondered an idea to build a new kind of time-series database to process log files. The question we discussed was: what programming language would be most suitable for the task?

We are both experienced Java developers. But my friend was convinced that Go, with it light-weight threads and C roots, would outperform Java. I don’t know Go well and wanted to see if I can confirm this using a simple test.

In this post I’d like to find out if Go really is that faster than Java (for that particular case).

The task

The scenario is following: we want to process a large (>1GB) log file by applying a regular expression on each line, and if a match was found, print the line out to the console.

Setup

We are using Go version 1.5. The test code is available in github1. Each test needs to compiled separately.

Creating the sample log file

First we need to have a log file to process. We create it ourselves using Go. The lines should have following format:

2015-11-12 15:02:28.93508032 +0100 CET -- log entry #0

Numbering each line has the advantage that we get an idea of progress of the processing.

On my machine, a log file with 20.000.000 lines creates a file with a size of around 1.24 GB. We create a log file like this:

create-log-file.go
func main() {
    log.Println("Creating log file: " + logFile)
    fo, err := os.Create(logFile)
    now := time.Now()
    for i:=0; i < numberOfLines; i++  {
        date := now.Add(time.Duration(i) * time.Second)
        _, err = fo.WriteString(fmt.Sprintf("%v -- %v\n", date, fmt.Sprintf("log entry #%v", i)))
        if i % 100000 == 0 {
            log.Printf("... created %v lines", i)
        }
    }
    log.Printf("Done writing %v lines", logFile)
}

Create the logfile using:

$ go build create-log-file.go
$ ./create-log-file

Regex filter

The filter we are using is this: .*\#[15][15]1110+$, i.e entries that start with 1 or 5, followed by 1 or 5, followed by 111 and ending with at least one 0. Quite arbitrary, but this filter spreads the matches relatively evenly across concurrent processes. There are 10 lines that match in our log sample.

Implementation

There are basically two costly operations:

  • reading each line from the log file
  • applying a regular expression on each line.

This is a task that should gain from being executed in parallel. If we had 20 million cores, each core could process one line (however, the feeding of the cores would still be happening sequentially).

Go implementation

Note: GOMAXPROCS

Before version 1.5 of Go, it was necessary to set the number of logical processors in Go when using parallel execution. Like this:

cpu := runtime.NumCPU()
log.Printf("Num cpu: %v\n", cpu)
runtime.GOMAXPROCS(cpu)

This is not necessary anymore 2, so in my tests I won’t tweak this setting (more concretely, I won’t write about this, in fact, I did tweak it and it made no difference whatsoever. Try it yourself if that interests you).

Looping through the log file

We start with measuring the time it takes to loop through the log file. Go offers basically two ways to do that: using bufio.Scanner and bufio.Reader.

Note: In the code examples I’m only showing the relevant parts for brevity (e.g I omit error handling). Please see the github repo for the complete implementation.

scanner-performance.go
func main() {
    start := time.Now()
    log.Printf("Started: %v", start)
    inputFile, err := os.Open("log-sample.txt")
    scanner := bufio.NewScanner(inputFile)
    i := 0
    for scanner.Scan() {
        line := scanner.Bytes()
        i++
        if line == nil {} // just to avoid compilation error
    }
    fmt.Printf("Count: %v\nElapsed time: %v\n", i, time.Since(start))
}
$ go build scanner-performance.go
$ ./scanner-performance 
2015/11/20 10:21:53 Started: 2015-11-20 10:21:53.717315327 +0100 CET
Count: 20000000
Elapsed time: 2.266481235s

That looks promising. We process 20 million lines in 2.2 seconds. Let’s see how this compares to bufio.Reader:

reader-performance.go
func main() {
    start := time.Now()
    log.Printf("Started: %v", start)
    inputFile, err := os.Open("log-sample.txt")
    defer inputFile.Close()

    reader := bufio.NewReader(inputFile)
    i := 0
    for _, _, err := reader.ReadLine(); err==nil;
    _, _, err = reader.ReadLine(){
        i++;
    }
    fmt.Printf("Count: %v\nElapsed time: %v\n", i, time.Since(start))
}

The result:

$ go build reader-performance.go 
$ ./reader-performance 
2015/11/20 10:24:54 Started: 2015-11-20 10:24:54.322864572 +0100 CET
Count: 20000000
Elapsed time: 1.843919688s

That’s even faster so let’s use Reader.

Regular expression in Go

We are going to use the regexp module of go:

var regex = regexp.MustCompile(".*#[15][15]1110+$")
...
if regex.Match(string) ....

Sequential solution

The sequential solution is straightforward: open the log file and filter each line using a regex.

sequential.go
var regex = regexp.MustCompile(".*#[15][15]1110+$")
func main() {
    count := 0
    start := time.Now()
    log.Printf("Started: %v", start)
    inputFile, err := os.Open(logfile.LogFile)
    defer inputFile.Close()
    reader := bufio.NewReader(inputFile)
    for line, _, err := reader.ReadLine(); err==nil;
        line, _, err = reader.ReadLine(){
        if regex.Match(line) {
            fmt.Printf("Match: %v\n", string(line))
        }
    }
    fmt.Printf("Elapsed time: %v\n", time.Since(start))
}

Run with:

$ go build sequential.go 
$ ./sequential 

The result

2015/11/20 10:29:40 Started: 2015-11-20 10:29:40.944923413 +0100 CET
Match: 2015-11-19 21:14:40.002737283 +0100 CET -- log entry #111110
Match: 2015-11-20 08:21:20.002737283 +0100 CET -- log entry #151110
Match: 2015-11-24 12:21:20.002737283 +0100 CET -- log entry #511110
Match: 2015-11-24 23:28:00.002737283 +0100 CET -- log entry #551110
Match: 2015-12-01 11:01:10.002737283 +0100 CET -- log entry #1111100
Match: 2015-12-06 02:07:50.002737283 +0100 CET -- log entry #1511100
Match: 2016-01-16 18:07:50.002737283 +0100 CET -- log entry #5111100
Match: 2016-01-21 09:14:30.002737283 +0100 CET -- log entry #5511100
Match: 2016-03-26 04:46:10.002737283 +0100 CET -- log entry #11111000
Match: 2016-05-11 12:52:50.002737283 +0200 CEST -- log entry #15111000
Elapsed time: 1m38.747270878s

Around one and a half minutes. That is rather disappointing.

After a little bit of research and reading about Go’s regex implementation, we may improve the performance of the regex. But strictly speaking, that should not be allowed in this test, as we want to offer a system that can deal with all sorts of regex (we’d need to implement some internal optimization then).

If we change the regex from .*#[15][15]1110+$ to ^[^#]*#[15][15]1110+$, we get following result:

2015/11/20 10:34:27 Started: 2015-11-20 10:34:27.920836865 +0100 CET
...
Elapsed time: 1m2.590272369s

Indeed, this regex performs better. Still, one minute is not very satisfying, we are aiming for something much faster.

Parallel processing

How can we try to get more concurrency into our program? The best would be if we could just split the log file into even chunks and process those in parallel. But as we need to process complete lines (of arbitrary length), I don’t see an immediate solution for that.

We still could split the files in a pre-processing step. Or we could fill an array with lines and process that in parallel.

But all this is rather unusual. The “classical” way to process a task like this concurrently is to insert a queue into the process. The inventors of Go have given us already everything we need to that, namely, goroutines and channels.

Let’s try this then: the main process goes through the log file and puts each line into a queue. A set of workers pick up the lines from the queue and apply the regular expression. We are only going to use an unbuffered channel, as this guarantees message delivery.

parallel.go
const numberWorkers = 4
var waitGroup sync.WaitGroup
var regex = regexp.MustCompile("^[^#]*#[15][15]1110+$")
func main() {
    start := time.Now()
    log.Printf("Started: %v", start)
    inputFile, err := os.Open("log-sample.txt")
    defer inputFile.Close()

    waitGroup.Add(numberWorkers)

    queue := make(chan []byte)
    for gr := 1; gr <= numberWorkers; gr++ {
        go worker(queue, gr)
    }

    reader := bufio.NewReader(inputFile)
    for line, _, err := reader.ReadLine(); err==nil;
    line, _, err = reader.ReadLine(){
        queue <- line
    }

    close(queue)
    waitGroup.Wait()
    fmt.Printf("Elapsed time: %v\n", time.Since(start))
}

func worker(queue chan []byte, id int) {
    defer waitGroup.Done()
    for {
        line, ok := <-queue
        if !ok {
            fmt.Printf("Worker: %d : Shutting Down\n", id)
            return
        }
        if regex.Match(line) {
            fmt.Printf("[%v] Match: %v\n", id, string(line))
        }
    }
}

Execute:

$ go build parallel.go 
$ ./parallel 

The result using 4 workers:

2015/11/20 10:47:21 Started: 2015-11-20 10:47:21.150202228 +0100 CET
[4] Match: 2015-11-19 21:14:40.002737283 +0100 CET -- log entry #111110
[2] Match: 2015-11-20 08:21:20.002737283 +0100 CET -- log entry #151110
[1] Match: 2015-11-24 12:21:20.002737283 +0100 CET -- log entry #511110
[2] Match: 2015-11-24 23:28:00.002737283 +0100 CET -- log entry #551110
[1] Match: 2015-12-01 11:01:10.002737283 +0100 CET -- log entry #1111100
[4] Match: 2015-12-06 02:07:50.002737283 +0100 CET -- log entry #1511100
[2] Match: 2016-01-16 18:07:50.002737283 +0100 CET -- log entry #5111100
[1] Match: 2016-01-21 09:14:30.002737283 +0100 CET -- log entry #5511100
[1] Match: 2016-03-26 04:46:10.002737283 +0100 CET -- log entry #11111000
[2] Match: 2016-05-11 12:52:50.002737283 +0200 CEST -- log entry #15111000
Worker: 4 : Shutting Down
Worker: 1 : Shutting Down
Worker: 3 : Shutting Down
Worker: 2 : Shutting Down
Elapsed time: 54.745756999s

Yes we have improved the result, around 54 seconds, not much but at least something.

Do we get a better result with more workers? Let’s double them: 8 and then 16.

8 workers:

2015/11/20 10:54:30 Started: 2015-11-20 10:54:30.267149314 +0100 CET
[5] Match: 2015-11-19 21:14:40.002737283 +0100 CET -- log entry #111110
[1] Match: 2015-11-20 08:21:20.002737283 +0100 CET -- log entry #151110
[1] Match: 2015-11-24 12:21:20.002737283 +0100 CET -- log entry #511110
[5] Match: 2015-11-24 23:28:00.002737283 +0100 CET -- log entry #551110
[3] Match: 2015-12-01 11:01:10.002737283 +0100 CET -- log entry #1111100
[1] Match: 2015-12-06 02:07:50.002737283 +0100 CET -- log entry #1511100
[1] Match: 2016-01-16 18:07:50.002737283 +0100 CET -- log entry #5111100
[7] Match: 2016-01-21 09:14:30.002737283 +0100 CET -- log entry #5511100
[7] Match: 2016-03-26 04:46:10.002737283 +0100 CET -- log entry #11111000
[1] Match: 2016-05-11 12:52:50.002737283 +0200 CEST -- log entry #15111000
Worker: 2 : Shutting Down
Worker: 5 : Shutting Down
Worker: 8 : Shutting Down
Worker: 1 : Shutting Down
Worker: 3 : Shutting Down
Worker: 7 : Shutting Down
Worker: 4 : Shutting Down
Worker: 6 : Shutting Down
Elapsed time: 37.484420982s

Wow, huge gain. That comes a little bit by surprise as the number of logical processors on my machine is 4, and I would have thought that anything greater than that would just cause overhead. But is shows that Go’s promise, that creating goroutines is cheap, seems to be true.

16 workers:

Elapsed time: 33.005427284s

Still better. Lets increase further:

32 workers:

Elapsed time: 31.268409369s

64 workers:

Elapsed time: 31.390695598s

128 workers:

Elapsed time: 31.204039411s

It looks like we are not gaining much by increasing the number of workers beyond 32.

So here we go: the best we could do in Go (using language default elements and no further tweaking) is around 30 seconds to process 20 million lines with a regular expression. Let’s see how Java performs.

Java implementation

The loop

We are going to use standard Java BufferedReader to process the log file:

BufferedReaderPerformance.java
class BufferedReaderPerformance
{
    public static void main(String[] args)
    {
        Instant start = Instant.now();
        System.out.println("Started: " + start);
        try (InputStream in = Files.newInputStream(Paths.get("../logprocessing-performance-go/log-sample.txt"));
             BufferedReader reader = new BufferedReader(new InputStreamReader(in)))
        {
            String line;
            int i = 0;
            while ((line = reader.readLine()) != null)
            {
                i++;
            }
            System.out.println(i);
        }
        catch (IOException ex)
        {
            System.err.println(ex);
        }
        Instant end = Instant.now();
        System.out.println(Duration.between(start, end));
    }
}

The result:

Started: 2015-11-20T10:10:40.377Z
20000000
PT5.079S

5 seconds. Not as fast as Go, Java is already falling behind.

Sequential

Using the java.util.regex package, the implementation is straight forward:

Sequential.java
class Sequential
{
    public static final Pattern pattern = Pattern.compile("^[^#]*#[15][15]1110+$");

    public static void main(String[] args)
    {
        Instant start = Instant.now();
        System.out.println("Started: " + start);
        try (InputStream in = Files.newInputStream(Paths.get("../logprocessing-performance-go/log-sample.txt"));
             BufferedReader reader = new BufferedReader(new InputStreamReader(in)))
        {
            String line;
            while ((line = reader.readLine()) != null)
            {
                final Matcher matcher = pattern.matcher(line);
                if (matcher.matches())
                {
                    System.out.print(String.format("Match: %s\n", line));
                }
            }
        }
        catch (IOException ex)
        {
            System.err.println(ex);
        }
        Instant end = Instant.now();
        System.out.println(Duration.between(start, end));
    }
}

The result:

Started: 2015-11-20T10:13:10.038Z
Match: 2015-11-19 21:14:40.002737283 +0100 CET -- log entry #111110
Match: 2015-11-20 08:21:20.002737283 +0100 CET -- log entry #151110
Match: 2015-11-24 12:21:20.002737283 +0100 CET -- log entry #511110
Match: 2015-11-24 23:28:00.002737283 +0100 CET -- log entry #551110
Match: 2015-12-01 11:01:10.002737283 +0100 CET -- log entry #1111100
Match: 2015-12-06 02:07:50.002737283 +0100 CET -- log entry #1511100
Match: 2016-01-16 18:07:50.002737283 +0100 CET -- log entry #5111100
Match: 2016-01-21 09:14:30.002737283 +0100 CET -- log entry #5511100
Match: 2016-03-26 04:46:10.002737283 +0100 CET -- log entry #11111000
Match: 2016-05-11 12:52:50.002737283 +0200 CEST -- log entry #15111000
PT18.13S

Holy shmokes! 18 seconds. That comes with a big surprise. Even though the looping takes more time, the Java result clearly outperforms Go.

In this particular case, regular expression processing is done much faster in Java than Go.

Concurrent solution

Let’s see if we can further improve the result. In Java, we don’t have immediate language constructs like Channels to build a queue. But we can quite easily create a Producer/Consumer pattern.

We are going to use java.util.concurrent.ExecutorService to manage thread creation and java.util.concurrent.ArrayBlockingQueue as the queue implementation.

Parallel.java
class Parallel
{
    public static final Pattern pattern = Pattern.compile("^[^#]*#[15][15]1110+$");
    static final int numberOfWorkers = 4;
    static ExecutorService executor = Executors.newFixedThreadPool(numberOfWorkers);
    static final ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<String>(512);

    public static void main(String[] args)
    {
        Instant start = Instant.now();
        System.out.println("Started: " + start);
        try (
                InputStream in = Files.newInputStream(Paths.get("../logprocessing-performance-go/log-sample.txt"));
                BufferedReader reader = new BufferedReader(new InputStreamReader(in))
        )
        {
            // start workers
            for (int i = 0; i < numberOfWorkers; i++)
            {
                Worker worker = new Worker(queue, i);
                executor.submit(worker);
            }

            String line;
            while ((line = reader.readLine()) != null)
            {
                queue.put(line);
            }
            executor.shutdown();
        }
        catch (Exception ex)
        {
            System.err.println(ex);
        }
        Instant end = Instant.now();
        System.out.println(Duration.between(start, end));
    }

    static class Worker implements Runnable
    {
        private final BlockingQueue<String> queue;
        private final int id;
        public Worker(BlockingQueue<String> q, int id)
        {
            queue = q;
            this.id = id;
        }

        public void run()
        {
            try
            {
                while (!Thread.currentThread().isInterrupted())
                {
                    String line = queue.take();
                    final Matcher matcher = pattern.matcher(line);
                    if (matcher.matches())
                    {
                        System.out.print(String.format("[%s] Match: %s\n", id, line));
                    }
                }
            }
            catch (InterruptedException ex)
            {
            }
        }
    }
}

2 workers:

Started: 2015-11-20T10:16:50.941Z
[0] Match: 2015-11-19 21:14:40.002737283 +0100 CET -- log entry #111110
[1] Match: 2015-11-20 08:21:20.002737283 +0100 CET -- log entry #151110
[1] Match: 2015-11-24 12:21:20.002737283 +0100 CET -- log entry #511110
[0] Match: 2015-11-24 23:28:00.002737283 +0100 CET -- log entry #551110
[0] Match: 2015-12-01 11:01:10.002737283 +0100 CET -- log entry #1111100
[0] Match: 2015-12-06 02:07:50.002737283 +0100 CET -- log entry #1511100
[1] Match: 2016-01-16 18:07:50.002737283 +0100 CET -- log entry #5111100
[0] Match: 2016-01-21 09:14:30.002737283 +0100 CET -- log entry #5511100
[0] Match: 2016-03-26 04:46:10.002737283 +0100 CET -- log entry #11111000
[1] Match: 2016-05-11 12:52:50.002737283 +0200 CEST -- log entry #15111000
Done
PT18.278S

4 workers:

PT22.691S

8 workers:

PT26.831S

We don’t gain anything by introducing concurrency in the (already fast) sequential implementation. Threads are expensive in Java.

Conclusion

Java wins. What is nice in Go is the ability to introduce concurrency at little or no cost. It helped to double the performance. On the Java side, the overhead of creating (expensive) threads did not give any performance boost. However, processing regular expressions (the way we have been using them in this test) performs much better in Java.

Recap of the results

Testing machine: MacBook Pro, Intel Core i5, 2.4GHz, 2 cores, 8GB Ram

Input file: plain-text log file with 20.000.000 lines (1.24GB)

Go

Algorithm Settings Execution time
Looping with counter n/a 1.8s
Sequential processing n/a 1m2s
Parallel unbuffered channel number of goroutines: 4 54s
Parallel unbuffered channel number of goroutines: 8 37s
Parallel unbuffered channel number of goroutines: 16 33s
Parallel unbuffered channel number of goroutines: 32 31s
Parallel unbuffered channel number of goroutines: 64 31s
Parallel unbuffered channel number of goroutines: 128 31s

Java

Algorithm Settings Execution time
Looping with counter n/a 5s
Sequential processing n/a 18s
Parallel BlockingQueue number of threads: 2 18s
Parallel BlockingQueue number of threads: 4 22s
Parallel BlockingQueue number of threads: 8 26s

Can you improve those results? If so let me know :)

Thanks for reading

I hope this post was helpful for you. If you have comments, questions or found a bug please let me know, either in the comments below or contact me directly.

Resources

Subscribe to Human Intelligence Engineering
Explorations of human ingenuity in a world of technology.