Data Science with Apache Groovy
Discover why Apache Groovy is an ideal programming language for building data science solutions. Learn how to streamline your process using common JVM libraries and how to build powerful applications that yield valuable insights.
Apache Groovy offers a wealth of features that make it ideal for many data science and big data scenarios.
In this 12-hour workshop, we explore the key benefits of using Groovy to develop data science solutions and demonstrate a variety of strategies for efficiently processing and visualizing data across some common data science problems.
Groovy for Data Science
Some reasons Apache Groovy is an ideal programming language for data science:
- Like Python, Groovy has a dynamic nature, which means it's powerful, easy to learn, and productive. The language gets out of the way and lets data scientists write their algorithms naturally.
- Like Java and Kotlin, Groovy has a static nature: optional static typing and static compilation make it fast when needed.
- Groovy's close alignment with Java means that you can often just cut-and-paste Java examples from various big data solutions and they’ll work just fine.
- Groovy has first-class functional support, offering features and solution styles similar to those of Scala. Functional and stream processing with immutable data structures offers many advantages in parallel processing or clustered environments.
- Groovy integrates with many JVM libraries that are commonly used in data science solutions, including libraries for data manipulation, machine learning, and plotting, as well as various big data solutions for scaling up these algorithms.
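The functional, stream-style processing mentioned in the list above can be sketched in a few lines. The example is plain Java using only `java.util.stream`, chosen because Java examples can often be pasted into Groovy largely unchanged. The `GroupMeans` class, the `Sample` type, and the data are hypothetical and are not taken from the workshop material. It groups an unmodifiable list of samples by label and averages each group without mutating any shared state.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class GroupMeans {

    // Hypothetical immutable value type: one labelled measurement.
    public static final class Sample {
        private final String group;
        private final double value;

        public Sample(String group, double value) {
            this.group = group;
            this.value = value;
        }

        public String getGroup() { return group; }
        public double getValue() { return value; }
    }

    // Groups samples by label and computes the mean value of each group.
    // A TreeMap keeps the result sorted by label.
    public static Map<String, Double> meanByGroup(List<Sample> samples) {
        return samples.stream()
                .collect(Collectors.groupingBy(
                        Sample::getGroup,
                        TreeMap::new,
                        Collectors.averagingDouble(Sample::getValue)));
    }

    public static void main(String[] args) {
        // Illustrative data only; wrapped so the list cannot be modified.
        List<Sample> samples = Collections.unmodifiableList(Arrays.asList(
                new Sample("a", 1.0),
                new Sample("a", 3.0),
                new Sample("b", 4.0)));

        System.out.println(meanByGroup(samples)); // prints {a=2.0, b=4.0}
    }
}
```

Because the pipeline has no side effects, replacing `stream()` with `parallelStream()` is one way to spread the same computation across cores.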
This workshop is ideal for existing Groovy or Java developers wanting to boost their data science knowledge or data scientists who wish to explore how using Groovy will enable them to expand their capabilities and enhance their productivity.
Although everyone is welcome, previous exposure to the JVM, Groovy and/or Gradle will be beneficial. Any previous exposure to data science topics will also be beneficial.
The workshop works through examples of many common data science problems, including:
- Data slicing and visualization
- Linear regression for prediction
- Clustering for grouping
- Natural language processing for language detection and intent matching
- Deep learning for image and digit detection
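As a taste of the linear regression topic listed above, here is a minimal sketch of a one-variable least-squares fit. It is plain Java with no external dependencies and does not use any of the libraries covered in the workshop. The class name `SimpleLinearRegression`, its methods, and the sample data are all hypothetical.

```java
public class SimpleLinearRegression {

    // Fits y = intercept + slope * x by ordinary least squares.
    // Returns a two-element array: {intercept, slope}.
    public static double[] fit(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two or more paired observations");
        }
        int n = x.length;

        // Means of both variables.
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        // Sums of cross-deviations (sxy) and squared x-deviations (sxx).
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            sxy += dx * (y[i] - meanY);
            sxx += dx * dx;
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("all x values are identical");
        }

        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[] {intercept, slope};
    }

    // Predicts y for a new x using coefficients returned by fit().
    public static double predict(double[] coefficients, double x) {
        return coefficients[0] + coefficients[1] * x;
    }

    public static void main(String[] args) {
        // Illustrative data lying exactly on y = 1 + 2x.
        double[] x = {1, 2, 3, 4};
        double[] y = {3, 5, 7, 9};

        double[] c = fit(x, y);
        System.out.println("intercept=" + c[0] + ", slope=" + c[1]); // intercept=1.0, slope=2.0
        System.out.println("prediction at x=5: " + predict(c, 5.0)); // 11.0
    }
}
```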
Math/Data Science libraries covered include:
- Weka, Smile, Apache Commons Math, BeakerX notebooks, Deeplearning4j, Apache NLPCraft
Libraries for scaling/concurrency include:
- Apache Spark, Apache Ignite, Apache MXNet, GPars, Apache Beam
While not essential, we strongly recommend attendees have a GitHub account. This provides the most options for carrying out the exercises and allows code samples and documentation to be shared efficiently. If you don't already have one, you can create your free GitHub account at https://github.com/.
Students have two options for running the exercises:
- Students with a GitHub account can run most exercises online using Gitpod. Depending on network connections, this can be a little slow but requires no local software installation.
- Students can also run the examples on their local machines if they have JDK 8 or JDK 11 installed and are familiar with using Gradle from their favorite IDE. This option requires either cloning the repo or downloading the code as a zip file and unpacking it. The workshop doesn't cover installing the JDK or an IDE. By all means ask if you are having trouble, but if you choose this option, you'll get the most out of the workshop if you have both ready to go.
| Jun 28 - Jun 30 | Day | King |
Dates & Times
Mon, Jun 28, 1:00pm to 5:00pm
Tue, Jun 29, 1:00pm to 5:00pm
Wed, Jun 30, 1:00pm to 5:00pm