Data Science with Apache Groovy

Discover why Apache Groovy is an ideal programming language for building data science solutions, and learn how to streamline your workflow with common JVM libraries to build applications that yield valuable insights.


Apache Groovy offers a wealth of features that make it ideal for many data science and big data scenarios.

In this 12-hour workshop, we explore the key benefits of using Groovy to develop data science solutions and demonstrate a variety of strategies for efficiently processing and visualizing data across some common data science problems.

2021 Open-Enrollment workshop

Dates: June 28 to June 30

Time: 1:00 p.m. to 5:00 p.m. CDT

Duration: 12 hours (4 hours per day for 3 days)

Location: Online

Instructor: Paul King

Registration Fee: $199.00 USD

Enrollment for the 2021 open enrollment offering is closed. Contact us to set up a custom workshop for your team.

Groovy for Data Science

Some reasons Apache Groovy is an ideal programming language for data science:

  • Like Python, Groovy has a dynamic nature, which means it's powerful, easy to learn, and productive. The language gets out of the way and lets data scientists write their algorithms naturally.
  • Like Java and Kotlin, Groovy also has a static nature: code can be statically type-checked and compiled, which makes it fast when needed.
  • Groovy's close alignment with Java means that you can often just copy and paste Java examples from various big data solutions and they’ll work just fine.
  • Groovy has first-class functional support, so it allows solutions similar to those written in Scala. Functional and stream processing with immutable data structures offers many advantages when working in parallel processing or clustered environments.
  • Groovy integrates with many JVM libraries that are commonly used in data science solutions, including libraries for data manipulation, machine learning, and plotting, as well as various big data solutions for scaling up these algorithms.
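As a rough illustration of the dynamic, static, and functional styles listed above, here is a minimal Groovy sketch. It is not taken from the workshop materials: the sample data and the names `prices`, `variance`, and `doubled` are hypothetical and chosen only for this example.

```groovy
import groovy.transform.CompileStatic

// Hypothetical sample data, invented for this sketch.
def prices = [120.0d, 135.5d, 150.25d, 99.75d]

// Dynamic style: no type declarations, concise collection methods.
def mean = prices.sum() / prices.size()
def aboveMean = prices.findAll { it > mean }

// Static style: @CompileStatic asks the compiler to type-check and
// statically compile this method, much as Java would.
@CompileStatic
double variance(List<Double> values) {
    double m = 0.0d
    for (double v : values) m += v
    m /= values.size()
    double total = 0.0d
    for (double v : values) total += (v - m) * (v - m)
    return total / values.size()
}

// Functional style: transform with a closure and keep the result immutable.
def doubled = prices.collect { it * 2 }.asImmutable()

println "mean=${mean}, aboveMean=${aboveMean}"
println "variance=${variance(prices)}"
println "doubled=${doubled}"
```

The same script mixes all three styles freely, so an algorithm can start out dynamic and have only its hot spots statically compiled later.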

Intended Audience

This workshop is ideal for Groovy or Java developers who want to boost their data science knowledge, and for data scientists who want to explore how Groovy can expand their capabilities and enhance their productivity.


Everyone is welcome, but previous exposure to the JVM, Groovy, or Gradle will be beneficial, as will any familiarity with data science topics.

This is a live, hands-on training course. No recording of this training event will be made available for on-demand consumption.


This workshop illustrates the key benefits of using Groovy to develop data science solutions and works through example-rich treatments of many data science problems, including:

  • Data slicing and visualization
  • Linear regression for prediction
  • Clustering for grouping
  • Natural language processing for language detection and intent matching
  • Deep learning for image and digit detection

Math/Data Science libraries covered include:

  • Weka, Smile, Apache Commons Math, BeakerX notebooks, Deeplearning4j, Apache NLPCraft

Libraries for scaling/concurrency include:

  • Apache Spark, Apache Ignite, Apache MXNet, GPars, Apache Beam

Technical Requirements

While not essential, we strongly recommend that attendees have a GitHub account. This provides the most options for carrying out the exercises and allows code samples and documentation to be shared efficiently. If you don't already have one, you can create a GitHub account for free.

Students have two options for running the exercises:

  • Students with a GitHub account can run most exercises online using Gitpod. Depending on network connections, this can be a little slow but requires no local software installation.
  • Students can also run the examples on their local machines if they have JDK 8 or JDK 11 installed and are familiar with using Gradle from their favorite IDE. This option requires either cloning the repo or downloading and manually installing the code as a zip. The workshop doesn’t cover installing the JDK or an IDE. By all means ask if you are having trouble, but if you want to pursue this option, you’ll get the most out of the workshop if you have those ready to go.

Upcoming Offerings

Classes are currently being scheduled. Contact us to set up yours!
