Discussion:
Automatic scaling and downsizing of storm cluster
Shashank Prasad
2017-07-13 22:10:24 UTC
Hi Folks,

We have been using Storm for some time now and it serves our purpose very
well. The nature of our business is such that we process a high volume of
data only during certain hours of the day. The rest of the time the cluster
receives very little data, and sometimes it sits completely idle, receiving
no data at all. Since our infrastructure is on AWS, we are paying by the
hour.

Ideally, what I would like is automatic scaling up and down of Storm
supervisors depending on the traffic the cluster receives. I am wondering
whether any of you have faced something similar and implemented an
auto-scaling method. If so, how did you go about it? What applications are
you using to achieve it?

Thank you for your time!

-shashank
Ambud Sharma
2017-07-17 03:45:44 UTC
Note: be careful if your topology has state. Storm topologies may be
stateful, in which case autoscaling (up or down) could lead to data
inconsistencies.

You can use Hortonworks Cloudbreak to auto-scale based on configurable
thresholds; you can also try to put something together using custom
deployment scripts and Auto Scaling groups.

This is also doable in Dockerized environments (Compose/Swarm or
Kubernetes) with careful configuration and image building.
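
For the custom-script route, here is a rough sketch of the kind of decision
logic such a script might run on a timer. Everything in it is an assumption
for illustration: the function name, the per-node capacity figure, the
limits, and the scale-down margin are hypothetical and would need tuning
against real measurements. It only computes a target supervisor count; the
result would still have to be applied to the Auto Scaling group, and the
topology would likely need a rebalance before it uses the new workers.

```python
import math


def desired_supervisors(tuples_per_sec, per_node_capacity, current,
                        min_nodes=1, max_nodes=10, scale_down_margin=0.7):
    """Return a target supervisor count for the observed load.

    All parameters are illustrative assumptions, not Storm settings:
    per_node_capacity is the tuples/sec one supervisor is assumed to handle.
    """
    if per_node_capacity <= 0:
        raise ValueError("per_node_capacity must be positive")
    if not 0 < scale_down_margin <= 1:
        raise ValueError("scale_down_margin must be in (0, 1]")

    def clamp(n):
        return max(min_nodes, min(max_nodes, n))

    # Nodes needed if every node runs at its full assumed capacity.
    needed = clamp(math.ceil(tuples_per_sec / per_node_capacity))
    if needed >= current:
        # Scale up (or hold) as soon as the load requires it.
        return needed

    # Scale down more cautiously: size the smaller cluster as if each node
    # could only take a fraction of its capacity, so that small dips in
    # traffic do not cause nodes to be removed and re-added repeatedly.
    relaxed = clamp(math.ceil(
        tuples_per_sec / (per_node_capacity * scale_down_margin)))
    return min(current, relaxed)


if __name__ == "__main__":
    # Idle cluster of 5 nodes shrinks to the configured minimum.
    print(desired_supervisors(0, 1000, current=5))
    # A traffic spike on 2 nodes asks for more capacity right away.
    print(desired_supervisors(4500, 1000, current=2))
```

Scaling up immediately but scaling down with a margin is a deliberate
choice here: with hourly billing, removing a node too eagerly and adding it
back shortly after costs more than keeping it. Given the note above about
state, a script like this is probably only safe to run against topologies
that tolerate workers being moved.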