The Scala Benchmark Suite is based on the latest release (9.12, nicknamed “Bach”) of the DaCapo benchmark suite, a suite already popular among JVM researchers which specifically strives for “ease of use.” The Scala Benchmark Suite adds 12 Scala benchmarks, summarized in the table below, to the 14 Java benchmarks of the DaCapo benchmark suite.
| Benchmark | Description | Inputs (#) |
|---|---|---|
| actors | Trading sample with Scala and Akka actors | tiny–gargantuan (6) |
| apparat | Framework to optimize ABC, SWC, and SWF files | tiny–gargantuan (6) |
| factorie | Toolkit for deployable probabilistic modeling | tiny–gargantuan (6) |
| kiama | Library for language processing | small–default (2) |
| scalac | Compiler for the Scala 2 language | small–large (3) |
| scaladoc | Scala documentation tool | small–large (3) |
| scalap | Scala classfile decoder | small–large (3) |
| scalariform | Code formatter for Scala | tiny–huge (5) |
| scalatest | Testing toolkit for Scala and Java programmers | small–huge (4) |
| scalaxb | XML data‐binding tool | tiny–huge (5) |
| specs | Behaviour‐driven design framework | small–large (3) |
| tmt | Stanford Topic Modeling Toolbox | tiny–huge (5) |
To allow for easy experimentation with different inputs, several benchmarks come with more input sizes than the two to four (small, default, large, and huge) supported by the DaCapo benchmarks. For the Scala Benchmark Suite, this gives rise to 51 unique workloads, i.e., benchmark–input combinations; the DaCapo benchmark suite offers 44 such workloads.
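The workload count follows directly from the table: twelve benchmarks with between two and six inputs each. A minimal Scala sketch that tallies the "Inputs (#)" column (the object and value names are illustrative, not part of the suite):

```scala
// Cross-check of the workload count: one workload per benchmark-input
// combination, so summing the "Inputs (#)" column gives the total.
object WorkloadCount {
  val inputCounts: Map[String, Int] = Map(
    "actors"      -> 6, "apparat"     -> 6, "factorie"  -> 6,
    "kiama"       -> 2, "scalac"      -> 3, "scaladoc"  -> 3,
    "scalap"      -> 3, "scalariform" -> 5, "scalatest" -> 4,
    "scalaxb"     -> 5, "specs"       -> 3, "tmt"       -> 5
  )

  def main(args: Array[String]): Unit = {
    println(s"benchmarks: ${inputCounts.size}")       // 12
    println(s"workloads:  ${inputCounts.values.sum}") // 51
  }
}
```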
To ensure its representativeness, the Scala Benchmark Suite is based on a large set of applications from a range of different domains. In fact, only two categories of application present in the latest DaCapo benchmark suite are completely absent from the Scala Benchmark Suite: client/server applications (tomcat, tradebeans, and tradesoap) and in‐memory databases (h2); the former category is also absent from the DaCapo suite's earlier release (2006-10). Client/server applications are absent because all such DaCapo benchmarks rely on either a Servlet container or an application server, a dependency that a corresponding Scala benchmark would have to share. In‐memory databases are absent because, to the best of our knowledge, no Scala application in this category yet exists that is more than a thin wrapper around Java code.
While the range of domains covered is broad, several benchmarks nevertheless occupy the same niche. This was a deliberate choice to avoid the bias of preferring one application over another in domains where Scala is frequently used: automated testing (scalatest, specs), source‐code processing (scaladoc, scalariform), and machine learning (factorie, tmt). It has been shown that including several applications from the same domain is indeed justified; in particular, the respective benchmarks all exhibit distinct instruction mixes.