Scala Lesson 45 – Streams | Dataplexa

Streams in Scala

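In this lesson, you will learn about streams in Scala, using the Akka Streams library. Streams process continuous or large flows of data efficiently and safely, often in concurrent or asynchronous environments. They are distinct from the standard library's Stream collection, which Scala 2.13 deprecated in favor of LazyList.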

Streams allow you to handle data step by step instead of loading everything into memory at once, making them ideal for real-time and high-throughput systems.


What Is a Stream?

A stream represents a sequence of data elements that are produced and consumed over time.

  • Data flows through a pipeline
  • Elements are processed one at a time
  • Memory usage stays low

Streams are widely used in event processing, data pipelines, and reactive systems.
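To make the step-by-step idea concrete before any library is introduced, here is a minimal sketch using only the standard library's Iterator. It is not a full streaming library (no backpressure, no asynchrony), and the names LazyPipelineSketch, firstThreeEvenSquares, and pulled are hypothetical, chosen for illustration.

```scala
object LazyPipelineSketch {
  // Returns the first three even squares plus the number of elements
  // that were actually pulled from the unbounded source.
  def firstThreeEvenSquares(): (List[Int], Int) = {
    var pulled = 0
    // Unbounded source: 1, 2, 3, ... (never fully held in memory)
    val numbers: Iterator[Int] = Iterator.from(1).map { n => pulled += 1; n }

    val result = numbers
      .filter(_ % 2 == 0) // stage 1: keep even numbers
      .map(n => n * n)    // stage 2: square them
      .take(3)            // stop after three results
      .toList

    (result, pulled)
  }

  def main(args: Array[String]): Unit = {
    val (result, pulled) = firstThreeEvenSquares()
    println(s"result=$result, elements pulled=$pulled")
  }
}
```

Although the source is unbounded, only a handful of elements are pulled, because each element travels through the whole pipeline before the next one is requested.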


Why Use Streams?

Streams offer capabilities that ordinary in-memory collections lack or handle inefficiently:

  • Processing large datasets
  • Handling real-time data
  • Backpressure control
  • Composable processing stages

In Scala, streams are often implemented using libraries like Akka Streams.


Core Concepts of Akka Streams

  • Source – produces data
  • Flow – transforms data
  • Sink – consumes data

These components are combined to form a stream pipeline.
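As a rough sketch of how the three parts fit together, the example below builds each one as a separate value and combines them into a RunnableGraph. It assumes Akka 2.6 and an ActorSystem, both of which are covered later in this lesson; the object name PipelineBlueprintSketch and the values it uses are hypothetical.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Keep, RunnableGraph, Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object PipelineBlueprintSketch {
  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("BlueprintSketch")
    try {
      val source = Source(1 to 3)        // produces 1, 2, 3
      val flow   = Flow[Int].map(_ + 10) // transforms each element
      val sink   = Sink.seq[Int]         // collects elements into a Seq

      // Only a description so far: no element has been produced yet.
      val graph: RunnableGraph[Future[Seq[Int]]] =
        source.via(flow).toMat(sink)(Keep.right)

      // run() materializes the pipeline and starts the data flowing.
      Await.result(graph.run(), 5.seconds)
    } finally system.terminate()
  }
}
```

Calling PipelineBlueprintSketch.run() yields the elements 11, 12, 13. Await.result is used here only to keep the sketch short; production code would normally compose the Future instead of blocking.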


Adding Akka Streams Dependency

To use Akka Streams, add the dependency to your project's build.sbt. The %% operator appends your Scala binary version to the artifact name.

libraryDependencies += "com.typesafe.akka" %% "akka-stream" % "2.6.21"

Creating a Source

A Source is the starting point of a stream.

import akka.stream.scaladsl.Source

val source = Source(1 to 5)

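This Source emits the numbers 1 to 5. A Source is only a description of a stream; nothing is emitted until it is connected to a Sink and run.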
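A range is only one possible starting point. The sketch below shows a few other Source factory methods from akka.stream.scaladsl (Source.apply on a collection, Source.single, and Source.fromIterator). The object name SourceVariantsSketch is hypothetical, and the ActorSystem it needs is explained in the section on connecting a Source and a Sink.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object SourceVariantsSketch {
  def run(): (Seq[String], Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("SourceVariants")
    try {
      // Any immutable collection can back a Source.
      val fromList = Source(List("a", "b", "c"))
      // A Source that emits exactly one element.
      val single = Source.single(42)
      // An unbounded Source (1, 2, 3, ...), bounded with take before collecting.
      val fromIterator = Source.fromIterator(() => Iterator.from(1)).take(3)

      (
        Await.result(fromList.runWith(Sink.seq[String]), 5.seconds),
        Await.result(single.runWith(Sink.seq[Int]), 5.seconds),
        Await.result(fromIterator.runWith(Sink.seq[Int]), 5.seconds)
      )
    } finally system.terminate()
  }
}
```

Source.fromIterator takes a function that creates the iterator, so every run of the stream starts from a fresh iterator.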


Creating a Sink

A Sink consumes the data from a stream.

import akka.stream.scaladsl.Sink

val sink = Sink.foreach[Int](println)
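Printing is not the only way to consume a stream. The following sketch, under the same Akka 2.6 assumption, shows three other sinks whose results come back as a Future: Sink.fold, Sink.head, and Sink.seq. The object name SinkVariantsSketch is hypothetical.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, Future}
import scala.concurrent.duration._

object SinkVariantsSketch {
  def run(): (Int, Int, Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("SinkVariants")
    try {
      val numbers = Source(1 to 5) // a Source can be run more than once

      // Sink.fold combines all elements into one value.
      val sum: Future[Int] = numbers.runWith(Sink.fold[Int, Int](0)(_ + _))
      // Sink.head completes with the first element.
      val first: Future[Int] = numbers.runWith(Sink.head[Int])
      // Sink.seq collects every element (only sensible for bounded streams).
      val all: Future[Seq[Int]] = numbers.runWith(Sink.seq[Int])

      (
        Await.result(sum, 5.seconds),
        Await.result(first, 5.seconds),
        Await.result(all, 5.seconds)
      )
    } finally system.terminate()
  }
}
```

Each runWith call starts an independent run of the same Source, which is why one value can feed three different sinks.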

Connecting Source and Sink

To run a stream, connect a Source to a Sink.

import akka.actor.ActorSystem

// In Akka 2.6, an implicit ActorSystem in scope also provides the
// materializer, so the deprecated ActorMaterializer is not needed.
implicit val system: ActorSystem = ActorSystem("StreamSystem")

source.runWith(sink)

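This prints the numbers 1 to 5 as they flow through the stream. runWith returns a Future that completes when the stream finishes; call system.terminate() afterwards so the application can exit.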
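Put together as a standalone program, the snippets above might look like the following sketch. It assumes Akka 2.6.x on the classpath; the names PrintNumbersApp and printNumbers are hypothetical.

```scala
import akka.Done
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{ExecutionContext, Future}

object PrintNumbersApp {
  // Runs the stream; the returned Future completes once the Sink
  // has processed every element.
  def printNumbers()(implicit system: ActorSystem): Future[Done] =
    Source(1 to 5).runWith(Sink.foreach[Int](println))

  def main(args: Array[String]): Unit = {
    implicit val system: ActorSystem = ActorSystem("StreamSystem")
    // Execution context for the onComplete callback below.
    implicit val ec: ExecutionContext = system.dispatcher

    // Without terminate(), the actor system's threads keep the JVM alive.
    printNumbers().onComplete(_ => system.terminate())
  }
}
```

Shutting the system down in onComplete, rather than right after runWith, matters because the stream runs asynchronously and runWith returns before any element is printed.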


Using Flow for Transformation

A Flow transforms elements as they pass through.

import akka.stream.scaladsl.Flow

val flow = Flow[Int].map(_ * 2)
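Because a Flow is a value that is independent of any particular Source, it can be chained with other flows and reused. The sketch below illustrates this with two small flows joined by via; the names FlowCompositionSketch, double, onlyLarge, and doubleThenFilter are hypothetical.

```scala
import akka.NotUsed
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object FlowCompositionSketch {
  val double: Flow[Int, Int, NotUsed]    = Flow[Int].map(_ * 2)
  val onlyLarge: Flow[Int, Int, NotUsed] = Flow[Int].filter(_ > 4)

  // Two flows chained with via form a new, reusable Flow.
  val doubleThenFilter: Flow[Int, Int, NotUsed] = double.via(onlyLarge)

  def run(): (Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("FlowComposition")
    try {
      // The same Flow is applied to two unrelated sources.
      val a = Source(1 to 5).via(doubleThenFilter).runWith(Sink.seq[Int])
      val b = Source(List(10, 1)).via(doubleThenFilter).runWith(Sink.seq[Int])
      (Await.result(a, 5.seconds), Await.result(b, 5.seconds))
    } finally system.terminate()
  }
}
```

The first run yields 6, 8, 10 and the second yields 20, since doubled values of 4 or less are filtered out.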

Complete Stream Pipeline

Here is a full stream pipeline using Source, Flow, and Sink.

Source(1 to 5)
  .via(Flow[Int].map(_ * 2))
  .runWith(Sink.foreach(println))

Each number is doubled before being printed.


Backpressure Explained

Backpressure ensures that fast producers do not overwhelm slow consumers.

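  • Consumers signal how much data they can accept, so producers slow down automatically
  • The system remains stable because buffers stay bounded
  • No data is dropped unless you explicitly choose a dropping overflow strategy

Akka Streams handles backpressure automatically, following the Reactive Streams demand-signalling protocol.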
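One way to observe this is to put a deliberately slow stage behind a fast Source. The sketch below assumes Akka 2.6 and uses throttle to stand in for a slow consumer; the buffer size and rate are arbitrary illustration values, and the object name BackpressureSketch is hypothetical.

```scala
import akka.actor.ActorSystem
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object BackpressureSketch {
  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("BackpressureSketch")
    try {
      val result = Source(1 to 20)                // could emit all 20 elements at once
        .buffer(4, OverflowStrategy.backpressure) // when full, upstream is asked to wait
        .throttle(5, 100.millis)                  // slow stage: at most 5 elements per 100 ms
        .runWith(Sink.seq[Int])

      Await.result(result, 10.seconds)
    } finally system.terminate()
  }
}
```

Every element arrives, in order, even though the Source is far faster than the throttled stage. Swapping in a strategy such as OverflowStrategy.dropHead would instead discard the oldest buffered elements when the buffer fills, trading completeness for freshness.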


Asynchronous Processing

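Streams can process data asynchronously using mapAsync. The Future needs an ExecutionContext; here the actor system's dispatcher is used.

import scala.concurrent.{ExecutionContext, Future}

implicit val ec: ExecutionContext = system.dispatcher

Source(1 to 5)
  .mapAsync(2)(x => Future(x * 2))
  .runWith(Sink.foreach(println))

The first argument is the parallelism: up to 2 Futures run concurrently, and results are emitted downstream in the original element order. If order does not matter, mapAsyncUnordered emits each result as soon as it is ready.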
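The ordering guarantee is easiest to see when earlier elements take longer than later ones. In the sketch below, Thread.sleep stands in for slow work purely for demonstration (blocking a dispatcher thread is not something to do in real code), and the object name MapAsyncOrderSketch is hypothetical.

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object MapAsyncOrderSketch {
  def run(): Seq[Int] = {
    implicit val system: ActorSystem = ActorSystem("MapAsyncOrder")
    implicit val ec: ExecutionContext = system.dispatcher
    try {
      val result = Source(1 to 5)
        .mapAsync(2) { x =>
          Future {
            // Earlier elements sleep longer, so later Futures can finish first.
            Thread.sleep((6 - x) * 20L)
            x * 2
          }
        }
        .runWith(Sink.seq[Int])

      Await.result(result, 10.seconds)
    } finally system.terminate()
  }
}
```

Even though the Future for 2 can finish before the Future for 1, the collected result is still 2, 4, 6, 8, 10.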


Error Handling in Streams

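Streams can recover from errors gracefully. The recover operator emits one final fallback element when an upstream stage fails, then completes the stream, so the pipeline below prints 1, 2, and -1; elements 4 and 5 are never processed.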

Source(1 to 5)
  .map(x => if (x == 3) throw new Exception("Error") else x)
  .recover {
    case _: Exception => -1
  }
  .runWith(Sink.foreach(println))
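If the goal is to skip a bad element and keep processing, a supervision strategy is one alternative to recover. The sketch below contrasts the two approaches, assuming Akka 2.6's ActorAttributes.supervisionStrategy and Supervision.resumingDecider; the names ErrorHandlingSketch and failOnThree are hypothetical.

```scala
import akka.actor.ActorSystem
import akka.stream.{ActorAttributes, Supervision}
import akka.stream.scaladsl.{Flow, Sink, Source}
import scala.concurrent.Await
import scala.concurrent.duration._

object ErrorHandlingSketch {
  // Fails on 3 and passes every other element through unchanged.
  private def failOnThree(x: Int): Int =
    if (x == 3) throw new IllegalArgumentException("Error") else x

  def run(): (Seq[Int], Seq[Int]) = {
    implicit val system: ActorSystem = ActorSystem("ErrorHandlingSketch")
    try {
      // recover: emit one fallback element, then complete the stream.
      val recovered = Source(1 to 5)
        .map(failOnThree)
        .recover { case _: IllegalArgumentException => -1 }
        .runWith(Sink.seq[Int])

      // Resume supervision: drop the failing element and keep going.
      val resuming = Flow[Int]
        .map(failOnThree)
        .withAttributes(ActorAttributes.supervisionStrategy(Supervision.resumingDecider))
      val resumed = Source(1 to 5).via(resuming).runWith(Sink.seq[Int])

      (Await.result(recovered, 5.seconds), Await.result(resumed, 5.seconds))
    } finally system.terminate()
  }
}
```

The recover variant yields 1, 2, -1, while the resuming variant yields 1, 2, 4, 5. Which one fits depends on whether a failure should end the stream or only discard the offending element.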

Real-World Use Cases

  • Log processing
  • Event streaming
  • Data ingestion pipelines
  • Streaming APIs

📝 Practice Exercises


Exercise 1

Create a stream that prints numbers from 1 to 10.

Exercise 2

Transform a stream to square each number.

Exercise 3

Add error handling to a stream.


✅ Practice Answers


Answer 1

Source(1 to 10).runWith(Sink.foreach(println))

Answer 2

Source(1 to 5)
  .map(x => x * x)
  .runWith(Sink.foreach(println))

Answer 3

Source(1 to 5)
  .map(x => if (x == 4) throw new Exception("Fail") else x)
  .recover { case _: Exception => 0 }
  .runWith(Sink.foreach(println))

What’s Next?

In the next lesson, you will learn about HTTP Requests and how Scala communicates with external services.