Streams in Scala
In this lesson, you will learn about Streams in Scala. Streams are used to process continuous or large flows of data efficiently and safely, often in concurrent or asynchronous environments.
Streams allow you to handle data step by step instead of loading everything into memory at once, making them ideal for real-time and high-throughput systems.
What Is a Stream?
A stream represents a sequence of data elements that are produced and consumed over time.
- Data flows through a pipeline
- Elements are processed one at a time
- Memory usage stays low
Streams are widely used in event processing, data pipelines, and reactive systems.
Why Use Streams?
Streams solve problems that traditional collections cannot handle efficiently.
- Processing large datasets
- Handling real-time data
- Backpressure control
- Composable processing stages
In Scala, streams are often implemented using libraries like Akka Streams.
Core Concepts of Akka Streams
- Source – produces data
- Flow – transforms data
- Sink – consumes data
These components are combined to form a stream pipeline.
Adding Akka Streams Dependency
To use Akka Streams, add the dependency to your project.
libraryDependencies += "com.typesafe.akka" %% "akka-stream" % "2.6.21"
Creating a Source
A Source is the starting point of a stream.
import akka.stream.scaladsl.Source
val source = Source(1 to 5)
This Source emits numbers from 1 to 5.
Creating a Sink
A Sink consumes the data from a stream.
import akka.stream.scaladsl.Sink
val sink = Sink.foreach[Int](println)
Connecting Source and Sink
To run a stream, connect a Source to a Sink.
import akka.actor.ActorSystem
import akka.stream.ActorMaterializer
implicit val system = ActorSystem("StreamSystem")
implicit val materializer = ActorMaterializer()
source.runWith(sink)
This will print numbers as they flow through the stream.
Using Flow for Transformation
A Flow transforms elements as they pass through.
import akka.stream.scaladsl.Flow
val flow = Flow[Int].map(_ * 2)
Complete Stream Pipeline
Here is a full stream pipeline using Source, Flow, and Sink.
Source(1 to 5)
.via(Flow[Int].map(_ * 2))
.runWith(Sink.foreach(println))
Each number is doubled before being printed.
Backpressure Explained
Backpressure ensures that fast producers do not overwhelm slow consumers.
- Producers slow down automatically
- System remains stable
- No data loss
Akka Streams handles backpressure automatically.
Asynchronous Processing
Streams can process data asynchronously using mapAsync.
Source(1 to 5)
.mapAsync(2)(x => Future(x * 2))
.runWith(Sink.foreach(println))
This allows parallel processing while maintaining order.
Error Handling in Streams
Streams can recover from errors gracefully.
Source(1 to 5)
.map(x => if (x == 3) throw new Exception("Error") else x)
.recover {
case _: Exception => -1
}
.runWith(Sink.foreach(println))
Real-World Use Cases
- Log processing
- Event streaming
- Data ingestion pipelines
- Streaming APIs
📝 Practice Exercises
Exercise 1
Create a stream that prints numbers from 1 to 10.
Exercise 2
Transform a stream to square each number.
Exercise 3
Add error handling to a stream.
✅ Practice Answers
Answer 1
Source(1 to 10).runWith(Sink.foreach(println))
Answer 2
Source(1 to 5)
.map(x => x * x)
.runWith(Sink.foreach(println))
Answer 3
Source(1 to 5)
.map(x => if (x == 4) throw new Exception("Fail") else x)
.recover { case _ => 0 }
.runWith(Sink.foreach(println))
What’s Next?
In the next lesson, you will learn about HTTP Requests and how Scala communicates with external services.